Sunday, July 11, 2004
Some COM interview questions
What is IUnknown? What methods are provided by IUnknown? It is generally a good idea to have an answer for this question if you claim on your resume that you know COM. Otherwise, you may consider your interview failed at this point. IUnknown is the base interface of COM. All other interfaces must derive directly or indirectly from IUnknown. There are three methods in that interface: AddRef, Release and QueryInterface.
What are the purposes of the AddRef, Release and QueryInterface functions? AddRef increments the reference count of the object, Release decrements it (the object destroys itself when the count reaches zero), and QueryInterface obtains a pointer to the requested interface.
What should QueryInterface do if the requested interface is not supported? Return E_NOINTERFACE and set its out parameter to NULL.
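The three IUnknown rules above can be sketched in portable C++. This is a minimal illustration, not the SDK declarations: IUnknownLike, IGreeter, Greeter, InterfaceId and the k-prefixed constants are hypothetical stand-ins so the code compiles without <windows.h>. Real code would use IUnknown, REFIID, HRESULT and ULONG, and would normally use InterlockedIncrement/InterlockedDecrement for the count.

```cpp
#include <cstdint>

// Stand-ins for SDK types and constants (assumptions for this sketch).
using HResult = std::int32_t;
constexpr HResult kS_OK = 0;
constexpr HResult kE_NOINTERFACE = static_cast<HResult>(0x80004002u);
constexpr HResult kE_POINTER = static_cast<HResult>(0x80004003u);

// A real IID is a GUID; an enum keeps the sketch short.
enum class InterfaceId { IUnknown, IGreeter, IPrinter };

struct IUnknownLike {
    virtual HResult QueryInterface(InterfaceId iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    // Real COM interfaces declare no destructor; objects die only via Release().
    virtual ~IUnknownLike() = default;
};

struct IGreeter : IUnknownLike {
    virtual const char* Greet() = 0;
};

class Greeter final : public IGreeter {
public:
    // The creator receives the first reference, so the count starts at 1.
    static Greeter* Create() { return new Greeter; }

    HResult QueryInterface(InterfaceId iid, void** out) override {
        if (out == nullptr) return kE_POINTER;
        *out = nullptr;  // nullify first, so every failure path leaves it null
        if (iid == InterfaceId::IUnknown) {
            *out = static_cast<IUnknownLike*>(this);
        } else if (iid == InterfaceId::IGreeter) {
            *out = static_cast<IGreeter*>(this);
        } else {
            return kE_NOINTERFACE;
        }
        AddRef();  // every pointer handed out carries its own reference
        return kS_OK;
    }

    unsigned long AddRef() override { return ++refs_; }

    unsigned long Release() override {
        const unsigned long left = --refs_;
        if (left == 0) delete this;  // last reference gone: self-destruct
        return left;
    }

    const char* Greet() override { return "hello"; }

private:
    Greeter() = default;
    ~Greeter() override = default;
    unsigned long refs_ = 1;
};
```

A caller asks for an interface, uses it, and calls Release once for every pointer it received; asking for IPrinter here yields kE_NOINTERFACE and a null pointer.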
How would you create an instance of an object in COM? Well, it all depends on your project. Start your answer with CoCreateInstance or CoCreateInstanceEx and explain the difference between them. If the interviewer is still not satisfied, you’ll have to explain everything that happens behind the scenes, including the difference between a local server and an in-proc server, the meaning and mechanism of the class factory, etc. You may also mention other methods of object creation such as CoGetInstanceFromFile, but the discussion will then likely turn to monikers.
What happens when a client calls CoCreateInstance? Again, it all depends on the level of detail and the expertise of the interviewer. Start with a simple explanation of the class object and class factory mechanism. Further details would depend on the specific situation.
What are the limitations of CoCreateInstance? Well, the major problems with CoCreateInstance are that it returns only one interface pointer per call and creates the object only on the local system. To create a remote object, or to obtain several interfaces of the new object in a single call (through an array of MULTI_QI structures), one should use CoCreateInstanceEx.
What is aggregation? How can we get an interface of the aggregated object? Aggregation is a reuse mechanism in which the outer object exposes interfaces from the inner object as if they were implemented on the outer object itself. This is useful when the outer object would always delegate every call to one of its interfaces to the same interface in the inner object. Aggregation is actually a specialized case of containment/delegation, and is available as a convenience to avoid extra implementation overhead in the outer object in these cases. We can get a pointer to the inner interface by calling QueryInterface on the outer object with the IID of the inner interface.
C is aggregated by B, which in turn is aggregated by A. Our client requested C. What will happen? QueryInterface on A will delegate the request to B which, in turn, will delegate the request for the interface to C. This pointer will be returned to the client.
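The delegation described above can be sketched with one outer and one inner object; a longer chain such as A, B, C repeats the same forwarding at each level. This is a simplified, portable illustration under stated assumptions: Unknown, IInner, IOuter, Inner, Outer, Iid and the k-prefixed constants are hypothetical stand-ins for the SDK types, and the inner object is held by value instead of being created through a class factory with a controlling-unknown argument, as real aggregation does.

```cpp
#include <cstdint>

using HResult = std::int32_t;  // stand-in for HRESULT
constexpr HResult kOk = 0;
constexpr HResult kNoInterface = static_cast<HResult>(0x80004002u);

enum class Iid { Unknown, Outer, Inner };  // a real IID is a GUID

struct Unknown {
    virtual HResult QueryInterface(Iid iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual ~Unknown() = default;
};
struct IInner : Unknown { virtual int InnerValue() = 0; };
struct IOuter : Unknown { virtual int OuterValue() = 0; };

class Inner final : public IInner {
public:
    explicit Inner(Unknown* controlling) : controlling_(controlling) {}

    // Non-delegating QueryInterface: only the outer object calls this.
    HResult InnerQueryInterface(Iid iid, void** out) {
        *out = nullptr;
        if (iid != Iid::Inner) return kNoInterface;
        *out = static_cast<IInner*>(this);
        controlling_->AddRef();  // references are counted on the outer object
        return kOk;
    }

    // Delegating IUnknown: what a client holding IInner* actually reaches.
    HResult QueryInterface(Iid iid, void** out) override {
        return controlling_->QueryInterface(iid, out);
    }
    unsigned long AddRef() override { return controlling_->AddRef(); }
    unsigned long Release() override { return controlling_->Release(); }

    int InnerValue() override { return 7; }

private:
    Unknown* controlling_;  // not AddRef'ed: the outer object owns the inner one
};

class Outer final : public IOuter {
public:
    Outer() : inner_(this) {}

    HResult QueryInterface(Iid iid, void** out) override {
        *out = nullptr;
        if (iid == Iid::Unknown) {
            *out = static_cast<Unknown*>(this);
        } else if (iid == Iid::Outer) {
            *out = static_cast<IOuter*>(this);
        } else {
            // Unknown to the outer object: let the inner object answer,
            // so its interface appears to belong to the outer object.
            return inner_.InnerQueryInterface(iid, out);
        }
        AddRef();
        return kOk;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        const unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
    int OuterValue() override { return 1; }

private:
    ~Outer() override = default;
    unsigned long refs_ = 1;  // the creator holds the first reference
    Inner inner_;
};
```

Because the inner object's IUnknown methods delegate to the outer object, a client holding IInner can query back for IOuter, and all references land on a single count.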
What is a moniker? An object that implements the IMoniker interface. A moniker acts as a name that uniquely identifies a COM object. In the same way that a path identifies a file in the file system, a moniker identifies a COM object in the directory namespace.
What’s the difference, if any, between OLE and COM? OLE is built on top of COM. The question is not strict, because OLE had been built over COM for years, while COM as a technology was presented by Microsoft only a few years ago. You may also mention that COM is a specification, while OLE is a particular implementation of this specification, which in today’s world is not exactly true either, because what people call COM today is most likely Microsoft's implementation of the COM spec.
What’s the difference between COM and DCOM? Again, the question does not require a strict answer. Any DCOM object is still a COM object (DCOM extends COM) and any COM object may participate in DCOM transactions. DCOM introduced several improvements/optimizations for distributed environments, such as MULTI_QI (multiple QueryInterface() calls in one round trip), security contexts, etc. DCOM demonstrated the importance of the surrogate process (you cannot run an in-proc server on a remote machine directly; you need a surrogate process to host it). DCOM introduced load balancing.
What is a dual interface? A dual interface is one that supports both the IDispatch interface and a vtbl-based interface. Therefore, it can be used in a scripting environment like VBScript and still offer the power and speed of a vtbl-based interface in non-scripting environments. The discussion may then easily turn to an analysis of dual interface problems, so be prepared for this twist.
Can you have two dual interfaces in one class? Yes. You may have two dual interfaces in one class, but only one of them may be the default. The bottom line is that you cannot work with two dual interfaces at the same time, due to the nature of a dual interface! To support two dual interfaces in VB you would write something like:
dim d1 as IDualInterface1
dim d2 as IDualInterface2
set d1 = new MyClassWithTwoDuals
set d2 = d1
In an ATL class you would have to use the macro COM_INTERFACE_ENTRY2(IDispatch, IDualInterface1) to choose which of the dual interfaces answers a QueryInterface for IDispatch.
What is marshaling by value? Some objects can essentially be considered static: regardless of which methods are called, the state of the object does not change. Instead of accessing such an object remotely, it is possible to copy the static state of the object and create a new object with the same state information on the caller side. The caller won’t be able to notice the difference, but calls will be more efficient because they do not involve network round trips. This is called “marshaling by value”.
What is a multi-threaded apartment (MTA)? A single-threaded apartment (STA)? This is a pretty difficult question to answer briefly. Anyway, apartments were introduced by Microsoft in NT 3.51 and late Windows 95 to isolate the problem of running legacy non-thread-safe code in a multithreaded environment. Each thread was “encapsulated” in a so-called single-threaded apartment. The reason to create an object in an apartment is thread safety. COM is responsible for synchronizing access to the object even if the object inside the apartment is not thread-safe. Multithreaded apartments (MTA, or free-threaded apartments) were introduced in NT 4.0. The idea behind the MTA is that COM is not responsible for synchronizing object calls between threads. In the MTA the developer is responsible for that. See “Professional DCOM Programming” by Dr. Grimes et al. or “Essential COM” by Don Box for further discussion of this topic.
Let’s assume we have object B and aggregated object C (in-proc server), created by B. Can you access any interface of B from C? What’s the difference between aggregated and contained objects? Yes, you can. This is a fundamental postulate of COM: “If you can get there from here, you can get there from anywhere”, i.e. after QI’ing for IUnknown you may proceed to get a pointer to any other interface supported by the object. An aggregated object exposes its interface directly, without visible intervention of the object container. A contained object is created within the object container and its interfaces might be altered or filtered by the object container.
What is the ROT? The GIT? List the pros and cons of both. By definition, the running object table (ROT) is a globally accessible table on each computer that keeps track of all COM objects in the running state that can be identified by a moniker. Moniker providers register an object in the table, which increments the object’s reference count. Before the object can be destroyed, its moniker must be released from the table. The Global Interface Table (GIT) allows any apartment (either single- or multi-threaded) in a process to get access to an interface implemented on an object in any other apartment in the process.
If you have an object with two interfaces, can you custom marshal one of them? No! The decision to use custom marshaling is an all-or-nothing decision; an object has to custom marshal all its interfaces or none of them.
Is there a way to register an in-proc server without regsvr32.exe? Yes. Load the DLL and call its exported DllRegisterServer() from the client. Do not forget to call DllUnregisterServer() from the same client. You may also use the Registrar object for the same purpose, or manipulate the Windows registry directly.
What is VARIANT? Why and where would you use it? VARIANT is a large tagged union that can hold any automation type. This allows easy conversion of one automation type to another. The biggest disadvantage of VARIANT is the size of the union.
How can you guarantee that only a remote server is ever created by a client? Create the object (call CoCreateInstanceEx()) with the CLSCTX_REMOTE_SERVER flag.
What is __declspec(novtable)? Why would you need this? __declspec(novtable) is a Microsoft compiler optimization. The main idea of this optimization is to strip the vtable initialization code from an abstract class (the vtable of an abstract class is never used, yet its constructor still initializes it). MSDN has an article on this topic.
What is an IDL? IDL stands for Interface Definition Language. IDL is the language to describe COM interfaces.
What is In-proc? In-proc means an in-process COM object, i.e. a COM object that is implemented as a DLL and is supposed to be hosted by a container. When you have to instantiate an in-proc object remotely, you may use the dllhost.exe application, which was designed specially for this purpose.
What is OLE? OLE (Object Linking and Embedding) is the first implementation of the COM spec, available from MS before COM was officially named COM.
Give examples of OLE usage. The most famous examples are probably drag and drop and structured storage implementations.
What are the 2 storage types for a compound document? Storage and Stream.
Is a .doc document a compound document? Is it a structured storage? A compound document is a document that contains information about other documents hosted in this document. All Office documents _may_ be compound documents, but need not be. Word documents from version 6.0 and up are stored as structured storage.
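ATL COM Error handling
Because COM is language-independent, it reports errors through the HRESULT return code. HRESULT is defined as a simple long integer, i.e. a 32-bit value made up of three parts: facility, severity and status code. Microsoft provides some useful macros to work with it; among them, MAKE_HRESULT combines a severity, a facility and a status code into an HRESULT value (see MSDN for its parameters). However, HRESULT does not work well in high-level languages, because those languages hide the HRESULT value and handle errors as exceptions rather than return codes. When programming in those languages, an HRESULT alone carries too little error information. To overcome this shortcoming, COM provides the IErrorInfo, ICreateErrorInfo and ISupportErrorInfo interfaces. In the ATL Object Wizard, check Support ISupportErrorInfo on the Attributes page of the object's properties, and then add the following code wherever your program needs to return error information: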
// return Error(A2W((char*)szErrorText),IID_IEncrypt,::GetLastError());
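The three-part HRESULT layout described above can be illustrated with a small portable sketch. The function names below (MakeHResult, Severity, Facility, Code, Succeeded, Failed) are hypothetical stand-ins that mirror what the SDK macros MAKE_HRESULT, HRESULT_SEVERITY, HRESULT_FACILITY, HRESULT_CODE, SUCCEEDED and FAILED compute; real code should use the macros from the Windows headers.

```cpp
#include <cstdint>

using HResult = std::int32_t;  // stand-in for HRESULT, a signed 32-bit value

// Bit 31 holds the severity (1 = failure), the facility starts at bit 16,
// and the low 16 bits hold the status code.
constexpr HResult MakeHResult(std::uint32_t severity,
                              std::uint32_t facility,
                              std::uint32_t code) {
    return static_cast<HResult>((severity << 31) | (facility << 16) | code);
}

constexpr std::uint32_t Severity(HResult hr) {
    return (static_cast<std::uint32_t>(hr) >> 31) & 0x1u;
}

constexpr std::uint32_t Facility(HResult hr) {
    return (static_cast<std::uint32_t>(hr) >> 16) & 0x1fffu;
}

constexpr std::uint32_t Code(HResult hr) {
    return static_cast<std::uint32_t>(hr) & 0xffffu;
}

// Because the severity bit is the sign bit, success is simply "non-negative".
constexpr bool Succeeded(HResult hr) { return hr >= 0; }
constexpr bool Failed(HResult hr) { return hr < 0; }

// Facility numbers as defined in winerror.h.
constexpr std::uint32_t kFacilityItf = 4;    // FACILITY_ITF: interface-specific errors
constexpr std::uint32_t kFacilityWin32 = 7;  // FACILITY_WIN32: wrapped Win32 error codes

// Example: a custom interface-specific failure with status code 0x200.
constexpr HResult kMyError = MakeHResult(1, kFacilityItf, 0x200);
static_assert(Failed(kMyError), "severity bit set means failure");
```

For instance, MakeHResult(1, kFacilityWin32, 5) yields 0x80070005, the value Windows defines as E_ACCESSDENIED, which is the Win32 error 5 wrapped as an HRESULT.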