Wednesday, March 23, 2005


Some notes on Thread - 2


Thread Local Storage - The C++ Way
By Roland Schwarz

Global data, while usually considered poor design, is nevertheless often a useful means of preserving state between related function calls. With threads, the issue is unfortunately complicated by the fact that access must be synchronised so that no more than one thread modifies the data at a time.

There are times when you will want to have a globally visible object, while still having the data content accessible only to the calling thread, without holding off other threads that contend for the "same" global object. This is where thread local storage (TLS) comes in. TLS is something the operating system / threading subsystem provides, and by its very nature is rather low level.
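In C++11 and later the same idea is available portably through the `thread_local` keyword, which postdates this article. The following is only a minimal sketch; the names `g_counter`, `bump_n_times` and `bump_on_new_thread` are made up for illustration. It shows a "global" of which every thread sees its own copy, so no locking is needed:

```cpp
#include <cassert>
#include <thread>

// One instance of g_counter exists per thread (hypothetical name).
thread_local int g_counter = 0;

// Increment the calling thread's copy n times and return its value.
int bump_n_times(int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) ++g_counter;
    return g_counter;
}

// Run bump_n_times on a fresh thread and hand its result back.
int bump_on_new_thread(int n) {
    int result = -1;
    std::thread t([&result, n] { result = bump_n_times(n); });
    t.join();
    return result;
}
```

The new thread starts from its own zero-initialised counter, and nothing it does changes the value the calling thread sees.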

From a globally visible object (in C++) you expect that its constructor is called before you enter "main", and that it is disposed of properly after you exit from "main". Consequently one would expect a thread-local "global" object to be constructed when a thread starts up and destroyed when the thread exits. But this is not the case! Using the native API one can only have TLS that needs neither code to construct nor code to destruct.
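For comparison, standard C++11 `thread_local` objects do get the lifetime one would expect, although this is not what the native Windows API offers. A small sketch under that assumption, with the hypothetical names `Tracked`, `t_obj` and `run_workers`: each thread that touches the object constructs its own copy, and that copy is destroyed when the thread exits.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> g_constructed{0};
std::atomic<int> g_destroyed{0};

// Counts how often it is constructed and destroyed.
struct Tracked {
    Tracked()  { ++g_constructed; }
    ~Tracked() { ++g_destroyed; }
    int value = 42;
};

// Constructed on first use in each thread, destroyed at that thread's exit.
thread_local Tracked t_obj;

int touch() { return t_obj.value; }

// Start n threads that each touch t_obj, wait for them, and report whether
// exactly n objects were constructed and destroyed in the meantime.
bool run_workers(int n) {
    assert(n >= 0);
    const int c0 = g_constructed;
    const int d0 = g_destroyed;
    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i) workers.emplace_back([] { touch(); });
    for (auto& w : workers) w.join();
    return g_constructed - c0 == n && g_destroyed - d0 == n;
}
```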

While at first glance this is somewhat disappointing, there are reasons not to instantiate all these objects automatically on every thread creation. A clean solution to this problem is presented e.g. in the "boost" library. The standard "pthread" C library also addresses this problem properly. But when you need to use the native Windows threading API, or need to write a library that makes use of TLS but has no control over the threading API the client code is using, you are apparently lost.
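To illustrate what "properly" means for pthreads: `pthread_key_create` takes a destructor that the library calls at thread exit for every non-null slot value. The sketch below is POSIX-only and will not build with MSVC; the names `destroy_slot`, `make_key`, `worker` and `run_pthread_demo` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>

static pthread_key_t g_key;
static pthread_once_t g_once = PTHREAD_ONCE_INIT;
static std::atomic<int> g_cleanups{0};

// Called by the pthread library at thread exit for each non-null slot value.
extern "C" void destroy_slot(void* p) {
    delete static_cast<int*>(p);
    ++g_cleanups;
}

// Creates the key exactly once, no matter how many threads race here.
extern "C" void make_key() {
    int rc = pthread_key_create(&g_key, destroy_slot);
    assert(rc == 0);
    (void)rc;
}

// Stores a heap object in this thread's slot and never frees it explicitly.
extern "C" void* worker(void* arg) {
    pthread_once(&g_once, make_key);
    int* seed = static_cast<int*>(arg);
    pthread_setspecific(g_key, new int(*seed * 2));
    *seed = *static_cast<int*>(pthread_getspecific(g_key));
    return nullptr;
}

// Returns the doubled seed if the destructor ran exactly once, else -1.
int run_pthread_demo(int seed) {
    const int before = g_cleanups;
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, worker, &seed);
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, nullptr);
    return (g_cleanups - before == 1) ? seed : -1;
}
```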

Fortunately this is not true, and this is the topic of this article. The Windows Portable Executable (PE) format provides support for TLS callbacks. Although the documentation is hard to read, it can be done with current compilers, i.e. MSVC 6.0, 7.1, ... Since seemingly no one else has used this feature before, and not even the C runtime library (CRT) makes use of it, you should be a little careful and watch out for undesired behaviour. That the CRT does not use it does not mean that it does not implement it. Unfortunately there is a small bug in the MSVC 6.0 implementation, which my code also works around. (Note: this refers to the .tls thread-local storage section of the PE file; see the blog entries of 7/20-21/2004.)

If the concepts presented in this article prove workable in "real life", I would be glad if the article has helped to remove some dust from this topic and make it usable for a broader range of applications. I could e.g. think of a generalized atexit_thread function that makes use of the concepts presented here.
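What such a function might look like can be sketched in portable C++11, using a `thread_local` object instead of the PE callback mechanism of this article. The names `ThreadExitList`, `at_thread_exit_sketch` and `run_exit_demo` are hypothetical, and this is not the author's implementation:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Per-thread list of callbacks; its destructor runs them at thread exit.
class ThreadExitList {
public:
    ~ThreadExitList() {
        // Run in reverse order of registration, as atexit() does.
        for (auto it = callbacks_.rbegin(); it != callbacks_.rend(); ++it) (*it)();
    }
    void add(std::function<void()> fn) { callbacks_.push_back(std::move(fn)); }
private:
    std::vector<std::function<void()>> callbacks_;
};

// Register fn to run when the calling thread exits.
void at_thread_exit_sketch(std::function<void()> fn) {
    assert(fn);
    thread_local ThreadExitList list;  // one list per thread
    list.add(std::move(fn));
}

// Registers two callbacks on a worker thread and records the call order.
std::string run_exit_demo() {
    std::string log;
    std::thread t([&log] {
        at_thread_exit_sketch([&log] { log += "first "; });
        at_thread_exit_sketch([&log] { log += "second "; });
        log += "body ";
    });
    t.join();
    return log;
}
```

Running the callbacks in reverse registration order mirrors atexit(), so a later registration can still rely on resources set up by an earlier one.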

Before explaining the gory details, I want to mention Aaron W. LaFramboise, who made me aware of the existence of the TLS callback mechanism.

Using the code
If you are using the precompiled binaries, you simply need to copy the *.lib files to a convenient directory where your compiler will find libraries. Likewise, copy the files from the include directory to a directory where your compiler searches for includes. Alternatively you may simply copy the files to your project directory.

The following is a simple demonstration of usage, to get you started.

// first include the header file

// this is your class
struct A {
    A() : n(42) {
    }
    ~A() {
    }
    int the_answer_is() {
        int m = n;
        n = 0;
        return m;
    }
    int n;
};

// now define a tls wrapper of class A
tls_ptr< A > pA;

// this is the threaded procedure
void run(void*)
{
    // instantiate a new "A"
    pA.reset(new A);

    // access the tls-object
    int ans = pA->the_answer_is();

    // note, that we do not need to deallocate
    // the object. This is getting done automagically
    // when the thread exits.
}

int main(int argc, char* argv[])
{
    // the main thread also gets a local copy of the tls.
    pA.reset(new A);

    // start the thread
    _beginthread(&run, 0, 0);

    // call into the main threads version
    pA->the_answer_is();

    // the "run" thread should have ended when we
    // are exiting.

    // again we do not need to free our tls object.
    // this is comparable in behaviour to objects
    // at global scope.
    return 0;
}

At first glance it might seem more natural not to wrap the tls-objects as pointers, but in fact it is not. While the objects are globally visible, they are still "delegates" that forward to a thread-local copy. The natural way to express delegation in C++ is a pointer object. (The technical reason, of course, is that you cannot overload the "." operator, whereas "->" can be overloaded.)
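The delegation idea can be sketched in portable C++11. This is not the article's tls_ptr, which sits on top of the Windows TLS machinery; `tls_ptr_sketch`, `Answer` and `tls_ptr_demo` are hypothetical names, and the per-thread map keyed by the wrapper's address is just one simple way to get one slot per wrapper and thread:

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <unordered_map>

// Globally visible handle that forwards "->" to a per-thread object.
template <class T>
class tls_ptr_sketch {
public:
    void reset(T* p = nullptr) { slot().reset(p); }  // deletes any prior object
    void release() { reset(); }                      // same as reset(0)
    T* get() const { return slot().get(); }
    T* operator->() const { assert(get() != nullptr); return get(); }
    T& operator*() const { assert(get() != nullptr); return *get(); }
private:
    // Each thread owns a map from wrapper address to its object; the map is
    // destroyed at thread exit, which deletes the objects it holds.
    std::unique_ptr<T>& slot() const {
        thread_local std::unordered_map<const tls_ptr_sketch*, std::unique_ptr<T>> slots;
        return slots[this];
    }
};

struct Answer {
    int n = 42;
    int the_answer_is() { int m = n; n = 0; return m; }
};

tls_ptr_sketch<Answer> g_answer;

// True if the worker thread and the caller each see their own Answer.
bool tls_ptr_demo() {
    g_answer.reset(new Answer);
    bool worker_ok = false;
    std::thread t([&worker_ok] {
        bool empty_at_start = (g_answer.get() == nullptr);
        g_answer.reset(new Answer);
        worker_ok = empty_at_start && g_answer->the_answer_is() == 42;
    });
    t.join();
    return worker_ok && g_answer->the_answer_is() == 42
                     && g_answer->the_answer_is() == 0;
}
```

The worker consuming its answer leaves the caller's copy untouched, which is exactly the "same global object, different data" behaviour described above.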

You can of course use this mechanism when building a "*.exe" file, but you can also use it when building a "*.dll" image. However, when you plan to load your DLL with LoadLibrary(), you should define the macro TLS_ALLOC when building your DLL. This is not necessary when using your DLL by means of an import library. A similar restriction applies when delay-loading your DLL. Please consult your compiler documentation if you are interested in the reasons for this. (Defining TLS_ALLOC forces the use of the TlsAlloc() family of functions from the Win32 API.) (Note: see the excellent Under the Hood article on this; see the blog entries of 10/30-31/2003.)

The complete API is kept very simple:

tls_ptr< A > pA; // declare an object of class A
pA.reset(new A); // create a tls of class A when needed
pA.reset(new A(45)); // create a tls of class A with a custom constructor
// note, that this also deletes any prior objects
// that might have been allocated to pA
pA.release(); // same as pA.reset(0), releases the thread local object
A& refA = *pA; // get a temporary reference to the contained object for faster access
pA->the_answer_is(); // access the object
Please note again that it is not necessary to explicitly call the destructors of your class (or release()). This is very handy when you are writing a piece of code that has no control over the calling threads but must still be thread-safe. One caveat, however: the destructors of your class are called after the CRT code has ended the thread. Consequently, when you do something fancy in your destructors that causes the CRT to reallocate its internal thread-local storage pointers, you will be left with a small memory leak in the CRT. This is comparable in effect to using the native Win32 API functions to create a thread instead of _beginthread().

In principle that is all you need. But wait! I mentioned a small bug in version 6 of the compiler. Luckily it is easy to work around. I provide an include file, tlsfix.h, which you need to include in your program. Make sure it is included before windows.h. To be more precise: the TLS library must be searched before the default CRT library. So alternatively you may specify the library first on the command line and omit the inclusion of tlsfix.h.

I will not discuss the user interface here. Suffice it to say that it is essentially the same as in the boost library. However, I omitted the ability to specify arbitrary deleter functions, since this would have required including the boost library in my code. I wanted to keep it small and just demonstrate the principles. My implementation also deviates from boost in that it uses native compiler support for TLS variables, thus gaining an almost fourfold speed improvement. Needless to say, my implementation is Windows-specific.

When thinking about TLS for C++, the main question is how to run the constructors and destructors. A careful study of the PE format (e.g. in the MSDN library) reveals that it has provided for TLS support almost from the start. (Thanks again to Aaron W. LaFramboise, who read it carefully enough.) Of special interest is the section about TLS callbacks:

The program can provide one or more TLS callback functions (though Microsoft
compilers do not currently use this feature) to support additional
initialization and termination for TLS data objects. A typical reason to use
such a callback function would be to call constructors and destructors for
objects.

Well, it is true that the compilers do not use the feature, but nothing prevents user code from using it. One somehow must convince the compiler (to be honest, it is the linker) to place your callback so that the operating system will call it. It turns out that this is surprisingly simple (omitting the details for a moment).

// declare your callback
void NTAPI on_tls_callback(PVOID h, DWORD dwReason, PVOID pv)
{
    if( DLL_THREAD_DETACH == dwReason )
    {
        // per-thread cleanup goes here
    }
}

// put a pointer in a special segment
#pragma data_seg(".CRT$XLB")
PIMAGE_TLS_CALLBACK p_thread_callback = on_tls_callback;
#pragma data_seg()

You can even add more callbacks by appending pointers to the ".CRT$XLB" segment. The fancy definitions come from the windows.h and winnt.h include files.

Now about the details: you will find at times that your callbacks are not getting called. This happens when the linker does not correctly wire up your segments, which coincides with not using any __declspec(thread) in your code. A further study of the PE format description reveals:

The Microsoft run-time library facilitates this process by defining a memory
image of the TLS Directory and giving it the special name “__tls_used” (Intel
x86 platforms) or “_tls_used” (other platforms). The linker looks for this
memory image and uses the data there to create the TLS Directory. Other
compilers that support TLS and work with the Microsoft linker must use this same
technique.

Consequently, when the linker does not find the _tls_used symbol, it won't wire in your callbacks. Luckily this is easy to circumvent:

#pragma comment(linker, "/INCLUDE:__tls_used")
This will pull in the code from the CRT that manages TLS. When using a version 7 compiler, that is all you need. (Actually I tried this with 7.1.) It turns out, however, that this does not work with a version 6 compiler. The operating system cannot be the culprit, since code compiled by version 7 works properly. After a little guesswork you will find that the CRT code from version 6 is slightly broken, because it inserts a wrong offset to the callback table. It is then easy to replace the erroneous code and convince the linker to wire in the workaround before the broken version from the CRT. You can study the tlsfix.c file from my submission if you are interested in the details.

Points of Interest
Which is the first function of your program that gets called by the operating system? Of course it is not main(). That was easy. Then mainCRTStartup, specified as the entry point in the linker, comes to mind. Wrong again. Interestingly, the first function being called is the TLS callback with Reason == DLL_PROCESS_ATTACH. But wait: don't rely on this. It is not true on WinXP; I observed it on Win2000 only.

I have not yet tried the code on Win95/98, WinXP Home Edition and Win2003. I would be interested in feedback about using this code on these platforms. In principle it should work, because it is a feature of PE and not of the operating system, but ...



On Sun, 01 Aug 2004 16:41:18 -0500 "Aaron W. LaFramboise" wrote:

> Just as a FYI, I now have a copy of MSVC6, and am working on this.
> MSVC6 does, in fact, have the necessary support, but there is a bug (I
> had noticed this before, and this was one of the reasons I wasn't able
> to offer more information a few months ago, and I had entirely forgotten
> about it. Oops.). Fortunately, the bug is in the runtime library, not
> in the linker or anything else.

Yes the bug is, that the TLS handlers must be in a contiguous area
between the __xl_a and __xl_z symbols. I fixed this by running a small
piece of code during the startup (in __xi_a .. __xi_z area).

Finally I wrapped everything up into a small C file that either can be bound
to boost or be linked with the user application. Despite now having everything
in a single file, I think boost still should not give away the possibility of
letting the user code call the process/thread startup/termination hooks
directly. There always might be some code that needs this.

Thanks to Aaron we now have a TLS solution that can handle any thread
creation mechanism, while still residing in a statically bound library.

The tsstls.c file follows: To test it compile your application with BOOST_THREAD_USE_LIB
Boost Software License - Version 1.0 - August 17th, 2003

Permission is hereby granted, free of charge, to any person or organization
obtaining a copy of the software and accompanying documentation covered by
this license (the "Software") to use, reproduce, display, distribute,
execute, and transmit the Software, and to prepare derivative works of the
Software, and to permit third-parties to whom the Software is furnished to
do so, all subject to the following:

The copyright notices in the Software and this entire statement, including
the above license grant, this restriction and the following disclaimer,
must be included in all copies of the Software, in whole or in part, and
all derivative works of the Software, unless such copies or derivative
works are solely in the form of machine-executable object code generated by
a source language processor.


This piece of code is a result of the work of:
Aaron W.LaFramboise, who showed how to implement TLS-callback
Michael Glassford, who factored out the startup code
Bronek Kozicki, who showed me, that it is not harmful
to access the CRT after thread end
Roland Schwarz, who did the writing, runtime initialization
(.CRTXxx), correct dtor behaviour and broken MSVC 6 fix


#include <windows.h>
#include <stdlib.h>

typedef void (__cdecl *_PVFV)(void);
typedef void (NTAPI *_TLSCB)(HINSTANCE, DWORD, PVOID);

/* some symbols for connection to the runtime environment */
extern IMAGE_TLS_DIRECTORY _tls_used; /* the tls directory (located in .rdata segment) */
extern _TLSCB __xl_a[], __xl_z[]; /* tls initializers */

/* the boost tss startup interface */
extern void on_process_enter(void);
extern void on_process_exit(void);
extern void on_thread_exit(void);

/* some forward declarations */
static void on_tls_prepare(void);
static void on_process_init(void);
static void NTAPI on_thread_callback(HINSTANCE, DWORD, PVOID);

/* The .CRT$Xxx information is taken from Codeguru: */
/* */

/* The tls glue code is to be run first */
/* I don't think it is necessary to run it */
/* at .CRT$XIB level, since we are only */
/* interested in thread detachment. But */
/* this could be changed easily if required. */
#pragma data_seg(".CRT$XIU")
static _PVFV p_tls_prepare = on_tls_prepare;
#pragma data_seg()

/* we need to get control after all global ctors */
#pragma data_seg(".CRT$XCU")
static _PVFV p_process_init = on_process_init;
#pragma data_seg()

/* this is the TLS callback */
#pragma data_seg(".CRT$XLB")
_TLSCB p_thread_callback = on_thread_callback;
#pragma data_seg()

/* we will run the termination late */
#pragma data_seg(".CRT$XTU")
static _PVFV p_process_exit = on_process_exit;
#pragma data_seg()

static void on_tls_prepare(void)
{
    _TLSCB* pfbegin;
    _TLSCB* pfend;
    _TLSCB* pfdst;
    pfbegin = __xl_a;
    pfend = __xl_z;
    /* the following line has an important side effect: */
    /* if the TLS directory is not already there, it will */
    /* be created by the linker. (_tls_used) */
    pfdst = (_TLSCB*)_tls_used.AddressOfCallBacks;
    /* the following loop will merge the address pointers */
    /* into a contiguous area, since the tlssup code seems */
    /* to require this (at least on MSVC 6) */
    while (pfbegin < pfend) {
        if (*pfbegin != 0) {
            *pfdst = *pfbegin;
            ++pfdst;
        }
        ++pfbegin;
    }
    *pfdst = 0; /* terminate the merged table */
}

static void on_process_init(void)
{
    /* This hooks the main thread exit. It will run the */
    /* termination before global dtors, but will not be run */
    /* when 'quick' exiting the library! However, this is the */
    /* standard behaviour for all global dtors anyways. */
    atexit(on_thread_exit);

    /* hand over to boost */
    on_process_enter();
}

void NTAPI on_thread_callback(HINSTANCE h, DWORD dwReason, PVOID pv)
{
    if(dwReason == DLL_THREAD_DETACH)
        on_thread_exit();
}

void tss_cleanup_implemented(void) {}
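
The merge step in on_tls_prepare above is easier to follow in isolation. A portable sketch, with the hypothetical names `callback_t` and `compact_callbacks`, and working on ordinary arrays rather than the linker-generated .CRT$XL* table: it copies the non-null entries into one contiguous, null-terminated run.

```cpp
#include <cassert>
#include <cstddef>

using callback_t = void (*)();

// Copy the non-null entries of [begin, end) to dst in their original order
// and terminate the result with a null pointer. Returns the number copied.
// dst must have room for (end - begin) + 1 entries.
std::size_t compact_callbacks(callback_t* begin, callback_t* end, callback_t* dst) {
    assert(begin <= end && dst != nullptr);
    std::size_t count = 0;
    for (; begin < end; ++begin) {
        if (*begin != nullptr) {
            *dst++ = *begin;
            ++count;
        }
    }
    *dst = nullptr;  // a callback table ends at the first null entry
    return count;
}
```

A table is only walked up to its first null entry, so any gap between registered callbacks would hide the ones after it; that is why the entries have to be packed together.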




“I don't think any POSIX standard covers TLS as yet. However, it's covered by an amendment to C99 and C++98, so it looks like it's going to be standard across all compliant implementations. From the GCC manual:

5.48.1 ISO/IEC 9899:1999 Edits for Thread-Local Storage

The following are a set of changes to ISO/IEC 9899:1999 (aka C99) that
document the exact semantics of the language extension.

* `5.1.2 Execution environments'

Add new text after paragraph 1

Within either execution environment, a "thread" is a flow of
control within a program. It is implementation defined
whether or not there may be more than one thread associated
with a program. It is implementation defined how threads
beyond the first are created, the name and type of the
function called at thread startup, and how threads may be
terminated. However, objects with thread storage duration
shall be initialized before thread startup.

* `6.2.4 Storage durations of objects'

Add new text before paragraph 3

An object whose identifier is declared with the storage-class
specifier `__thread' has "thread storage duration". Its
lifetime is the entire execution of the thread, and its
stored value is initialized only once, prior to thread
startup.

* `6.4.1 Keywords'

Add `__thread'.

* `6.7.1 Storage-class specifiers'

Add `__thread' to the list of storage class specifiers in
paragraph 1.

Change paragraph 2 to

With the exception of `__thread', at most one storage-class
specifier may be given [...]. The `__thread' specifier may
be used alone, or immediately following `extern' or `static'.

Add new text after paragraph 6

The declaration of an identifier for a variable that has
block scope that specifies `__thread' shall also specify
either `extern' or `static'.

The `__thread' specifier shall be used only with variables.

5.48.2 ISO/IEC 14882:1998 Edits for Thread-Local Storage

The following are a set of changes to ISO/IEC 14882:1998 (aka C++98)
that document the exact semantics of the language extension.

* [intro.execution]

New text after paragraph 4

A "thread" is a flow of control within the abstract machine.
It is implementation defined whether or not there may be more
than one thread.

New text after paragraph 7

It is unspecified whether additional action must be taken to
ensure when and whether side effects are visible to other
threads.

* [lex.key]

Add `__thread'.

* [basic.start.main]

Add after paragraph 5

The thread that begins execution at the `main' function is
called the "main thread". It is implementation defined how
functions beginning threads other than the main thread are
designated or typed. A function so designated, as well as
the `main' function, is called a "thread startup function".
It is implementation defined what happens if a thread startup
function returns. It is implementation defined what happens
to other threads when any thread calls `exit'.

* [basic.start.init]

Add after paragraph 4

The storage for an object of thread storage duration shall be
statically initialized before the first statement of the
thread startup function. An object of thread storage
duration shall not require dynamic initialization.

* [basic.start.term]

Add after paragraph 3

The type of an object with thread storage duration shall not
have a non-trivial destructor, nor shall it be an array type
whose elements (directly or indirectly) have non-trivial
destructors.

* []

Add "thread storage duration" to the list in paragraph 1.

Change paragraph 2

Thread, static, and automatic storage durations are
associated with objects introduced by declarations [...].

Add `__thread' to the list of specifiers in paragraph 3.

* []

New section before []

The keyword `__thread' applied to a non-local object gives the
object thread storage duration.

A local variable or class data member declared both `static'
and `__thread' gives the variable or member thread storage
duration.

* []

Change paragraph 1

All objects which have neither thread storage duration,
dynamic storage duration nor are local [...].

* []

Add `__thread' to the list in paragraph 1.

Change paragraph 1

With the exception of `__thread', at most one
STORAGE-CLASS-SPECIFIER shall appear in a given
DECL-SPECIFIER-SEQ. The `__thread' specifier may be used
alone, or immediately following the `extern' or `static'
specifiers. [...]

Add after paragraph 5

The `__thread' specifier can be applied only to the names of
objects and to anonymous unions.

* [class.mem]

Add after paragraph 6

Non-`static' members shall not be `__thread'.
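
The restrictions quoted above (no dynamic initialization, no non-trivial destructor) mean that a `__thread` object can only be plain data. A common workaround is to keep a constant-initialized per-thread pointer and create the real object on first use. The sketch below uses the standard `thread_local` keyword in place of GCC's `__thread` so that it builds with any C++11 compiler; the names `t_name`, `thread_name`, `release_thread_name` and `lazy_init_demo` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Constant-initialized pointer: no constructor or destructor has to run
// for it, which is all the quoted `__thread` rules permit.
thread_local std::string* t_name = nullptr;

// Create the calling thread's string on first use.
std::string& thread_name() {
    if (t_name == nullptr) t_name = new std::string("unnamed");
    assert(t_name != nullptr);
    return *t_name;
}

// The price of the restriction: each thread must free its object itself.
void release_thread_name() {
    delete t_name;
    t_name = nullptr;
}

// True if a worker thread gets a fresh object while the caller keeps its own.
bool lazy_init_demo() {
    thread_name() = "main";
    std::string seen_by_worker;
    std::thread t([&seen_by_worker] {
        seen_by_worker = thread_name();
        release_thread_name();
    });
    t.join();
    bool ok = (seen_by_worker == "unnamed") && (thread_name() == "main");
    release_thread_name();
    return ok;
}
```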

For example, FreeBSD's support for Thread Local Storage (TLS) can be seen clearly from the following table

Peering Inside the PE: A Tour of the Win32 Portable Executable File Format
by Matt Pietrek


When you use the compiler directive __declspec(thread), the data that you define doesn't go into either the .data or .bss sections. It ends up in the .tls section, which refers to "thread local storage," and is related to the TlsAlloc family of Win32 functions. When dealing with a .tls section, the memory manager sets up the page tables so that whenever a process switches threads, a new set of physical memory pages is mapped to the .tls section's address space. This permits per-thread global variables. In most cases, it is much easier to use this mechanism than to allocate memory on a per-thread basis and store its pointer in a TlsAlloc'ed slot.

There's one unfortunate note that must be added about the .tls section and __declspec(thread) variables. In Windows NT and Windows 95, this thread local storage mechanism won't work in a DLL if the DLL is loaded dynamically by LoadLibrary. In an EXE or an implicitly loaded DLL, everything works fine. If you can't implicitly link to the DLL, but need per-thread data, you'll have to fall back to using TlsAlloc and TlsGetValue with dynamically allocated memory.

Although the .rdata section usually falls between the .data and .bss sections, your program generally doesn't see or use the data in this section. The .rdata section is used for at least two things. First, in Microsoft linker-produced EXEs, the .rdata section holds the debug directory, which is only present in EXE files. (In TLINK32 EXEs, the debug directory is in a section named .debug.) The debug directory is an array of IMAGE_DEBUG_DIRECTORY structures. These structures hold information about the type, size, and location of the various types of debug information stored in the file. Three main types of debug information appear: CodeView, COFF, and FPO.

A more detailed treatment can be found in the well-known book Programming Applications for Microsoft Windows, published by Microsoft Press. Its chapter 21 is devoted to Thread Local Storage. NOTE: This book was formerly titled Advanced Windows.




CFrameWnd* CWnd::GetParentFrame() const
{
    if (GetSafeHwnd() == NULL) // no Window attached
        return NULL;

    CWnd* pParentWnd = GetParent(); // start with one parent up
    while (pParentWnd != NULL)
    {
        if (pParentWnd->IsFrameWnd())
            return (CFrameWnd*)pParentWnd;
        pParentWnd = pParentWnd->GetParent();
    }
    return NULL;
}

_AFXWIN_INLINE CWnd* CWnd::GetParent() const
{ ASSERT(::IsWindow(m_hWnd)); return CWnd::FromHandle(::GetParent(m_hWnd)); }

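See that? It first calls the API GetParent, then looks up the object pointer in the current thread's window<->handle map, and then calls CWnd::IsFrameWnd to decide whether the object is a frame. (Thankfully this is a virtual function rather than a call to CObject::IsKindOf, otherwise the run-time class information would have to be walked yet again.)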


LRESULT CALLBACK
AfxWndProc(HWND hWnd, UINT nMsg, WPARAM wParam, LPARAM lParam)
{
    // special message which identifies the window as using AfxWndProc
    if (nMsg == WM_QUERYAFXWNDPROC)
        return 1;

    // all other messages route through message map
    CWnd* pWnd = CWnd::FromHandlePermanent(hWnd);
    ASSERT(pWnd->m_hWnd == hWnd);
    if (pWnd == NULL || pWnd->m_hWnd != hWnd)
        return ::DefWindowProc(hWnd, nMsg, wParam, lParam);
    return AfxCallWndProc(pWnd, hWnd, nMsg, wParam, lParam);
}

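Likewise, because these objects are owned by threads, MFC keeps these handle maps in thread-local storage (TLS). In other words, for one and the same handle, the corresponding objects in the handle maps of different threads need not be the same. This causes some problems in multithreaded programs; see Microsoft Knowledge Base article Q147578, "CWnd Derived MFC Objects and Multi-threaded Applications".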

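Microsoft has compiled a list of all possible error codes and assigned each one a 32-bit number. The WinError.h header file (more than 20,000 lines) contains the list of error codes defined by Microsoft. When a Windows function detects an error, it uses the thread-local storage mechanism to associate the corresponding error code with the calling thread. This lets threads run independently of one another without affecting each other's error codes.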








CWnd::FromHandle(HWND hWnd)
--> CHandleMap* pMap = afxMapHWND(TRUE); // create map if not exist
--> CWnd* pWnd = (CWnd*)pMap->FromHandle(hWnd);

CHandleMap* PASCAL afxMapHWND(BOOL bCreate)
{
    AFX_MODULE_THREAD_STATE* pState = AfxGetModuleThreadState();
    if (pState->m_pmapHWND == NULL && bCreate)
    {
        // create a new CHandleMap
    }
    return pState->m_pmapHWND; // CHandleMap* m_pmapHWND
}

--> return AfxGetModuleState()->m_thread.GetData();
--> _AFX_THREAD_STATE* pState = _afxThreadState;// AFX_DATADEF CThreadLocal<_AFX_THREAD_STATE> _afxThreadState
--> return pState->m_pModuleState;

m_thread is defined as AFX_DATADEF CThreadLocal m_thread, so it is thread-local storage as well. Tracing further into CThreadLocal reveals the hallmarks of thread-local storage: TlsAlloc and TlsFree.

CHandleMap::FromHandle(HANDLE h)
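--> CObject* pObject = LookupPermanent(h); // fetch the real window pointer for this handle from the permanent map
--> return (CObject*)m_permanentMap.GetValueAt((LPVOID)h);
--> if pObject is not null, return it; otherwise
pObject = LookupTemporary(h); // fetch the temporary window pointer for this handle from the temporary map
--> if pObject is not null, return it; otherwise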
// This handle wasn't created by us, so we must create a temporary C++ object to wrap it.




--> _AfxCbtFilterHook
--> CWnd::Attach
--> CHandleMap* pMap = afxMapHWND(TRUE);
pMap->SetPermanent(m_hWnd = hWndNew, this);



by duyanning


The macro THREAD_LOCAL(class_name, ident_name) can be used to define thread-local data. THREAD_LOCAL is defined as follows:

#define THREAD_LOCAL(class_name, ident_name) \
    AFX_DATADEF CThreadLocal<class_name> ident_name;


struct CMyThreadData : public CNoTrackObject
{
    CString strThread;
};

THREAD_LOCAL(CMyThreadData, threadData)
CThreadLocal<CMyThreadData> threadData;




First, note that in the Rotor code the CLR loader does not check for TLS callbacks that might be present.
Its PEVerifier::CheckDirectories() method has only

static DWORD s_dwAllowedBitmap =

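The CLR does this for safety reasons, of course. So how do we get thread-local data in .NET? The answer is the ThreadStaticAttribute class. It belongs to the System namespace and is derived as follows: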



namespace System {

    using System;

    [AttributeUsage(AttributeTargets.Field, Inherited = false),Serializable()]
    public class ThreadStaticAttribute : Attribute
    {
        public ThreadStaticAttribute()
        {
        }
    }
}


Static Fields
By default, static fields are scoped to AppDomains. In other words, each AppDomain gets its own copy of all the static fields for the types that are loaded into that AppDomain. This is independent of whether the code was loaded as domain-neutral or not. Loading code as domain neutral affects whether we can share the code and certain other runtime structures. It is not supposed to have any effect other than performance.

Although per-AppDomain is the default for static fields, there are 3 other possibilities:

RVA-based static fields are process-global. These are restricted to scalars and value types, because we do not want to allow objects to bleed across AppDomain boundaries. That would cause all sorts of problems, especially during AppDomain unloads. Some languages like ILASM and MC++ make it convenient to define RVA-based static fields. Most languages do not.

Static fields marked with System.ThreadStaticAttribute are scoped per-thread per-AppDomain. You get convenient declarative thread-local storage over and above the normal per-AppDomain cloning of static fields.

Static fields marked with System.ContextStaticAttribute are scoped per-context per-AppDomain. If you are using managed contexts and ContextBoundObject, this is a convenient way to get storage cloned in each managed context.

We considered (briefly) building thread-relative and context-relative versions of the existing .cctor class constructor. But that’s a lot of machinery to ensure that all static fields are initialized via a constructor that is coordinated by the system.

Instead, our docs recommend against initializing your thread-relative and context-relative static fields in a .cctor. The reason is that a .cctor executes only once per AppDomain. The static fields will get initialized in whatever thread and context the .cctor happens to run in. But all subsequent threads and contexts will have uninitialized data.

So the model you have today is that you should be prepared to initialize your thread-relative and context-relative statics on first use. This is fairly easy to do since we guarantee these statics are first initialized to 0. So you can use a thread-relative or context-relative static Boolean field (inited to false) or static Object reference (inited to null) to indicate that initialization hasn’t occurred yet.

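JGTM'2004 wrote:

TLS (thread local storage) is a handy thing in class-library and multithreaded application development, and it is well supported in many languages and tools (for example __declspec(thread) in Visual C++, threadvar in Delphi, and so on; the Win32 API also has the corresponding Tls family of functions). Some people who are new to .NET have started to complain that TLS is gone in the managed environment and that they have to write it themselves. Not so: although languages such as C# and VB.NET have no direct keyword or statement for declaring TLS, the CLR supports this feature, and more intuitively at that, through a custom attribute, namely ThreadStaticAttribute.

If you want a static member (static in C#, Shared in VB.NET) to have a different value for each thread (more precisely, for each combination of app-domain and thread), which is exactly the behaviour of TLS, you only need to apply the ThreadStatic attribute to it; no programming is required. (This is the declarative approach, of course. There is a programmatic one as well: see the Thread.AllocateDataSlot and Thread.AllocateNamedDataSlot methods, or look up the TLS entry in the .NET SDK Documentation Index.)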




The role of ThreadStaticAttribute is to tell the CLR that access to the static field it marks depends on the current thread and is independent of other threads.


class MyClass{
    [ThreadStatic] static public string threadvalue;
}

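threadvalue in MyClass is a thread-static field. If several threads in a program access this field at the same time, each thread accesses its own independent threadvalue. For example, if thread 1 sets it to "hello" and thread 2 then sets it to "world", thread 1 still gets "hello" when it finally reads it.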


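It is a static field, so no instance of MyClass is needed; just access it as MyClass.threadvalue.
One thread cannot access the thread-static fields of another thread, not even if you obtain a reference to the other thread's System.Threading.Thread object.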

If you know System.Runtime.Remoting.Messaging.Context (MContext for short below), compare the two:

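ThreadStatic is implemented inside the CLR, whereas MContext is a dictionary attached to the System.Threading.Thread object.
MContext is called logical thread context data. Its data is copied to the other thread during an asynchronous call, whereas thread-static fields are not copied. (For example, on eventHandlerInst.BeginInvoke(...) the new thread has the MContext data of the original thread. When eventHandlerInst.EndInvoke executes, the data in the new thread's MContext is restored to the thread that called EndInvoke. This will be covered in detail later, when Remoting is discussed.)
System.Web.HttpContext.Current is implemented with MContext.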


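You may need to imitate HttpContext and build a MyClass.Current, or build a ThreadSingleton along the lines of the Singleton pattern. You may have worker threads. Until now you shared data by passing an object along the thread, which made all the objects and methods look awkward. Now you can access that data directly through [ThreadStatic] fields.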


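public class ConnHolder : IDisposable
{
    // [ThreadStatic] only has an effect on static fields
    [ThreadStatic] static SqlConnection threadconn;
    bool createbyme=false;
    public ConnHolder()
    {
        if(threadconn==null) // the first holder on this thread creates the connection
        {
            threadconn=new SqlConnection(Config.ConnectionString);
            createbyme=true;
        }
    }
    public SqlConnection Connection
    {
        get { SqlConnection conn=threadconn; return conn; }
    }
    public void Dispose()
    {
        if(createbyme) { threadconn.Dispose(); threadconn=null; }
    }
}

using(ConnHolder ch=new ConnHolder())
using(SqlCommand cmd=ch.Connection.CreateCommand())
{
}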


Finally, a prediction from Brian Grunkemeyer (indeed, we can only call it a prediction):

Brian Grunkemeyer
.NET Framework Base Class Library team

In V1 and V1.1, each logical thread corresponded with exactly one physical
thread in the OS. We also provide a Threadpool used for async IO
operations, async delegate invocations, and to process any requests that go through QueueUserWorkItem. The choice of GC doesn't limit the number of threads in an application to one processor or anything like that. It may affect your scalability and performance though.

Note that in version 2, in some hosted environments like SQL Server,
multiple logical threads may be multi-tasked on the same physical thread,
using fibers. We might possibly add multiple finalizer threads in a future
version as well. Rest assured that we don't limit all your threads to one
processor (unless you set the thread's affinity), and don't make many
assumptions in your code about exactly which OS thread is running your
code. You can safely use Thread's AllocateDataSlot and
AllocateNamedDataSlot if you need a managed equivalent to thread-local
storage.

If you have any questions about the GC implementation, please read chapter 19 in Jeffrey Richter's "Applied Microsoft .NET Framework Programming". Understanding when to use finalizers & the dispose pattern and being able to use a managed memory profiler on your code will almost certainly give you a better return for your time than trying to figure out whether to use the server or workstation GC.


It seems that knowing this much is enough for the time being. Brian Grunkemeyer's blog says that the new Threading features Microsoft is about to develop are:

Semaphore class which was a missing functionality in the framework.
Named Events to enhance the cross-process communication.
Abandoned mutex detection.


Gheorghe Marius asked this question:

Question: Are there any plans to add support for memory mapped files in Whidbey ? If no...why ?

The answer is no Gheorghe. The why comes down to priorities, there simply wasn't enough interest in the feature at the point we were deciding what makes it into Whidbey, and what doesn't. We have become aware recently of the demand for this item, and we will be exploring it for the near future after Whidbey.

