Tuesday, July 13, 2004

 

Some C++ interview questions - 3

I.
Binary Tree Problems

By Nick Parlante
From http://cslibrary.stanford.edu/110/BinaryTrees.html

Here are 14 binary tree problems in increasing order of difficulty. Some of the problems operate on binary search trees (aka "ordered binary trees") while others work on plain binary trees with no special ordering. The next section, Section 3, shows the solution code in C/C++. Section 4 gives the background and solution code in Java. The basic structure and recursion of the solution code is the same in both languages -- the differences are superficial.
Reading about a data structure is a fine introduction, but at some point the only way to learn is to actually try to solve some problems starting with a blank sheet of paper. To get the most out of these problems, you should at least attempt to solve them before looking at the solution. Even if your solution is not quite right, you will be building up the right skills. With any pointer-based code, it's a good idea to make memory drawings of a a few simple cases to see how the algorithm should work.

1. build123()
This is a very basic problem with a little pointer manipulation. (You can skip this problem if you are already comfortable with pointers.) Write code that builds the following little 1-2-3 binary search tree...
2
/ \
1 3

Write the code in three different ways...

a: by calling newNode() three times, and using three pointer variables
b: by calling newNode() three times, and using only one pointer variable
c: by calling insert() three times passing it the root pointer to build up the tree
(In Java, write a build123() method that operates on the receiver to change it to be the 1-2-3 tree with the given coding constraints. See Section 4.)
struct node* build123() {


2. size()
This problem demonstrates simple binary tree traversal. Given a binary tree, count the number of nodes in the tree.
int size(struct node* node) {


3. maxDepth()
Given a binary tree, compute its "maxDepth" -- the number of nodes along the longest path from the root node down to the farthest leaf node. The maxDepth of the empty tree is 0, the maxDepth of the tree on the first page is 3.
int maxDepth(struct node* node) {


4. minValue()
Given a non-empty binary search tree (an ordered binary tree), return the minimum data value found in that tree. Note that it is not necessary to search the entire tree. A maxValue() function is structurally very similar to this function. This can be solved with recursion or with a simple while loop.
int minValue(struct node* node) {


5. printTree()
Given a binary search tree (aka an "ordered binary tree"), iterate over the nodes to print them out in increasing order. So the tree...
4
/ \
2 5
/ \
1 3

Produces the output "1 2 3 4 5". This is known as an "inorder" traversal of the tree.

Hint: For each node, the strategy is: recur left, print the node data, recur right.

void printTree(struct node* node) {


6. printPostorder()
Given a binary tree, print out the nodes of the tree according to a bottom-up "postorder" traversal -- both subtrees of a node are printed out completely before the node itself is printed, and each left subtree is printed before the right subtree. So the tree...
4
/ \
2 5
/ \
1 3

Produces the output "1 3 2 5 4". The description is complex, but the code is simple. This is the sort of bottom-up traversal that would be used, for example, to evaluate an expression tree where a node is an operation like '+' and its subtrees are, recursively, the two subexpressions for the '+'.

void printPostorder(struct node* node) {


7. hasPathSum()
We'll define a "root-to-leaf path" to be a sequence of nodes in a tree starting with the root node and proceeding downward to a leaf (a node with no children). We'll say that an empty tree contains no root-to-leaf paths. So for example, the following tree has exactly four root-to-leaf paths:
5
/ \
4 8
/ / \
11 13 4
/ \ \
7 2 1

Root-to-leaf paths:
path 1: 5 4 11 7
path 2: 5 4 11 2
path 3: 5 8 13
path 4: 5 8 4 1

For this problem, we will be concerned with the sum of the values of such a path -- for example, the sum of the values on the 5-4-11-7 path is 5 + 4 + 11 + 7 = 27.

Given a binary tree and a sum, return true if the tree has a root-to-leaf path such that adding up all the values along the path equals the given sum. Return false if no such path can be found. (Thanks to Owen Astrachan for suggesting this problem.)

int hasPathSum(struct node* node, int sum) {


8. printPaths()
Given a binary tree, print out all of its root-to-leaf paths as defined above. This problem is a little harder than it looks, since the "path so far" needs to be communicated between the recursive calls. Hint: In C, C++, and Java, probably the best solution is to create a recursive helper function printPathsRecur(node, int path[], int pathLen), where the path array communicates the sequence of nodes that led up to the current call. Alternately, the problem may be solved bottom-up, with each node returning its list of paths. This strategy works quite nicely in Lisp, since it can exploit the built in list and mapping primitives. (Thanks to Matthias Felleisen for suggesting this problem.)
Given a binary tree, print out all of its root-to-leaf paths, one per line.

void printPaths(struct node* node) {


9. mirror()
Change a tree so that the roles of the left and right pointers are swapped at every node.
So the tree...
4
/ \
2 5
/ \
1 3

is changed to...
4
/ \
5 2
/ \
3 1

The solution is short, but very recursive. As it happens, this can be accomplished without changing the root node pointer, so the return-the-new-root construct is not necessary. Alternately, if you do not want to change the tree nodes, you may construct and return a new mirror tree based on the original tree.

void mirror(struct node* node) {


10. doubleTree()
For each node in a binary search tree, create a new duplicate node, and insert the duplicate as the left child of the original node. The resulting tree should still be a binary search tree.
So the tree...
2
/ \
1 3

is changed to...
2
/ \
2 3
/ /
1 3
/
1

As with the previous problem, this can be accomplished without changing the root node pointer.

void doubleTree(struct node* node) {


11. sameTree()
Given two binary trees, return true if they are structurally identical -- they are made of nodes with the same values arranged in the same way. (Thanks to Julie Zelenski for suggesting this problem.)
int sameTree(struct node* a, struct node* b) {


12. countTrees()
This is not a binary tree programming problem in the ordinary sense -- it's more of a math/combinatorics recursion problem that happens to use binary trees. (Thanks to Jerry Cain for suggesting this problem.)
Suppose you are building an N node binary search tree with the values 1..N. How many structurally different binary search trees are there that store those values? Write a recursive function that, given the number of distinct values, computes the number of structurally unique binary search trees that store those values. For example, countTrees(4) should return 14, since there are 14 structurally unique binary search trees that store 1, 2, 3, and 4. The base case is easy, and the recursion is short but dense. Your code should not construct any actual trees; it's just a counting problem.

int countTrees(int numKeys) {



Binary Search Tree Checking (for problems 13 and 14)
This background is used by the next two problems: Given a plain binary tree, examine the tree to determine if it meets the requirement to be a binary search tree. To be a binary search tree, for every node, all of the nodes in its left tree must be <= the node, and all of the nodes in its right subtree must be > the node. Consider the following four examples...
a. 5 -> TRUE
/ \
2 7


b. 5 -> FALSE, because the 6 is not ok to the left of the 5
/ \
6 7


c. 5 -> TRUE
/ \
2 7
/
1

d. 5 -> FALSE, the 6 is ok with the 2, but the 6 is not ok with the 5
/ \
2 7
/ \
1 6

For the first two cases, the right answer can be seen just by comparing each node to the two nodes immediately below it. However, the fourth case shows how checking the BST quality may depend on nodes which are several layers apart -- the 5 and the 6 in that case.


13 isBST() -- version 1
Suppose you have helper functions minValue() and maxValue() that return the min or max int value from a non-empty tree (see problem 3 above). Write an isBST() function that returns true if a tree is a binary search tree and false otherwise. Use the helper functions, and don't forget to check every node in the tree. It's ok if your solution is not very efficient. (Thanks to Owen Astrachan for the idea of having this problem, and comparing it to problem 14)
Returns true if a binary tree is a binary search tree.

int isBST(struct node* node) {


14. isBST() -- version 2
Version 1 above runs slowly since it traverses over some parts of the tree many times. A better solution looks at each node only once. The trick is to write a utility helper function isBSTRecur(struct node* node, int min, int max) that traverses down the tree keeping track of the narrowing min and max allowed values as it goes, looking at each node only once. The initial values for min and max should be INT_MIN and INT_MAX -- they narrow from there.
/*
Returns true if the given tree is a binary search tree
(efficient version).
*/
int isBST2(struct node* node) {
return(isBSTRecur(node, INT_MIN, INT_MAX));
}

/*
Returns true if the given tree is a BST and its
values are >= min and <= max.
*/
int isBSTRecur(struct node* node, int min, int max) {


15. Tree-List
The Tree-List problem is one of the greatest recursive pointer problems ever devised, and it happens to use binary trees as well. CLibarary #109 http://cslibrary.stanford.edu/109/ works through the Tree-List problem in detail and includes solution code in C and Java. The problem requires an understanding of binary trees, linked lists, recursion, and pointers. It's a great problem, but it's complex.


Section 3 -- C/C++ Solutions
Make an attempt to solve each problem before looking at the solution -- it's the best way to learn.
1. Build123() Solution (C/C++)
// call newNode() three times
struct node* build123a() {
struct node* root = newNode(2);
struct node* lChild = newNode(1);
struct node* rChild = newNode(3);
root->left = lChild;
root->right= rChild;

return(root);
}

// call newNode() three times, and use only one local variable
struct node* build123b() {
struct node* root = newNode(2);
root->left = newNode(1);
root->right = newNode(3);

return(root);
}


/*
Build 123 by calling insert() three times.
Note that the '2' must be inserted first.
*/
struct node* build123c() {
struct node* root = NULL;
root = insert(root, 2);
root = insert(root, 1);
root = insert(root, 3);
return(root);
}


2. size() Solution (C/C++)
/*
Compute the number of nodes in a tree.
*/
int size(struct node* node) {
if (node==NULL) {
return(0);
} else {
return(size(node->left) + 1 + size(node->right));
}
}

3. maxDepth() Solution (C/C++)
/*
Compute the "maxDepth" of a tree -- the number of nodes along
the longest path from the root node down to the farthest leaf node.
*/
int maxDepth(struct node* node) {
if (node==NULL) {
return(0);
}
else {
// compute the depth of each subtree
int lDepth = maxDepth(node->left);
int rDepth = maxDepth(node->right);
// use the larger one
if (lDepth > rDepth) return(lDepth+1);
else return(rDepth+1);
}
}


4. minValue() Solution (C/C++)
/*
Given a non-empty binary search tree,
return the minimum data value found in that tree.
Note that the entire tree does not need to be searched.
*/
int minValue(struct node* node) {
struct node* current = node;
// loop down to find the leftmost leaf
while (current->left != NULL) {
current = current->left;
}

return(current->data);
}


5. printTree() Solution (C/C++)
/*
Given a binary search tree, print out
its data elements in increasing
sorted order.
*/
void printTree(struct node* node) {
if (node == NULL) return;
printTree(node->left);
printf("%d ", node->data);
printTree(node->right);
}


6. printPostorder() Solution (C/C++)
/*
Given a binary tree, print its
nodes according to the "bottom-up"
postorder traversal.
*/
void printPostorder(struct node* node) {
if (node == NULL) return;
// first recur on both subtrees
printTree(node->left);
printTree(node->right);

// then deal with the node
printf("%d ", node->data);
}


7. hasPathSum() Solution (C/C++)
/*
Given a tree and a sum, return true if there is a path from the root
down to a leaf, such that adding up all the values along the path
equals the given sum.
Strategy: subtract the node value from the sum when recurring down,
and check to see if the sum is 0 when you run out of tree.
*/
int hasPathSum(struct node* node, int sum) {
// return true if we run out of tree and sum==0
if (node == NULL) {
return(sum == 0);
}
else {
// otherwise check both subtrees
int subSum = sum - node->data;
return(hasPathSum(node->left, subSum) ||
hasPathSum(node->right, subSum));
}
}


8. printPaths() Solution (C/C++)
/*
Given a binary tree, print out all of its root-to-leaf
paths, one per line. Uses a recursive helper to do the work.
*/
void printPaths(struct node* node) {
int path[1000];
printPathsRecur(node, path, 0);
}

/*
Recursive helper function -- given a node, and an array containing
the path from the root node up to but not including this node,
print out all the root-leaf paths.
*/
void printPathsRecur(struct node* node, int path[], int pathLen) {
if (node==NULL) return;

// append this node to the path array
path[pathLen] = node->data;
pathLen++;

// it's a leaf, so print the path that led to here
if (node->left==NULL && node->right==NULL) {
printArray(path, pathLen);
}
else {
// otherwise try both subtrees
printPathsRecur(node->left, path, pathLen);
printPathsRecur(node->right, path, pathLen);
}
}

// Utility that prints out an array on a line.
void printArray(int ints[], int len) {
int i;
for (i=0; i printf("%d ", ints[i]);
}
printf("\n");
}


9. mirror() Solution (C/C++)
/*
Change a tree so that the roles of the
left and right pointers are swapped at every node.
So the tree...
4
/ \
2 5
/ \
1 3

is changed to...
4
/ \
5 2
/ \
3 1
*/
void mirror(struct node* node) {
if (node==NULL) {
return;
}
else {
struct node* temp;

// do the subtrees
mirror(node->left);
mirror(node->right);

// swap the pointers in this node
temp = node->left;
node->left = node->right;
node->right = temp;
}
}


10. doubleTree() Solution (C/C++)
/*
For each node in a binary search tree,
create a new duplicate node, and insert
the duplicate as the left child of the original node.
The resulting tree should still be a binary search tree.
So the tree...
2
/ \
1 3

Is changed to...
2
/ \
2 3
/ /
1 3
/
1

*/
void doubleTree(struct node* node) {
struct node* oldLeft;

if (node==NULL) return;

// do the subtrees
doubleTree(node->left);
doubleTree(node->right);

// duplicate this node to its left
oldLeft = node->left;
node->left = newNode(node->data);
node->left->left = oldLeft;
}


11. sameTree() Solution (C/C++)
/*
Given two trees, return true if they are
structurally identical.
*/
int sameTree(struct node* a, struct node* b) {
// 1. both empty -> true
if (a==NULL && b==NULL) return(true);
// 2. both non-empty -> compare them
else if (a!=NULL && b!=NULL) {
return(
a->data == b->data &&
sameTree(a->left, b->left) &&
sameTree(a->right, b->right)
);
}
// 3. one empty, one not -> false
else return(false);
}



12. countTrees() Solution (C/C++)
/*
For the key values 1...numKeys, how many structurally unique
binary search trees are possible that store those keys.
Strategy: consider that each value could be the root.
Recursively find the size of the left and right subtrees.
*/
int countTrees(int numKeys) {

if (numKeys <=1) {
return(1);
}
else {
// there will be one value at the root, with whatever remains
// on the left and right each forming their own subtrees.
// Iterate through all the values that could be the root...
int sum = 0;
int left, right, root;

for (root=1; root<=numKeys; root++) {
left = countTrees(root - 1);
right = countTrees(numKeys - root);

// number of possible trees with this root == left*right
sum += left*right;
}

return(sum);
}
}



13. isBST1() Solution (C/C++)
/*
Returns true if a binary tree is a binary search tree.
*/
int isBST(struct node* node) {
if (node==NULL) return(true);
// false if the max of the left is > than us

// (bug -- an earlier version had min/max backwards here)
if (node->left!=NULL && maxValue(node->left) > node->data)
return(false);

// false if the min of the right is <= than us
if (node->right!=NULL && minValue(node->right) <= node->data)
return(false);

// false if, recursively, the left or right is not a BST
if (!isBST(node->left) || !isBST(node->right))
return(false);

// passing all that, it's a BST
return(true);
}



14. isBST2() Solution (C/C++)
/*
Returns true if the given tree is a binary search tree
(efficient version).
*/
int isBST2(struct node* node) {
return(isBSTUtil(node, INT_MIN, INT_MAX));
}
/*
Returns true if the given tree is a BST and its
values are >= min and <= max.
*/
int isBSTUtil(struct node* node, int min, int max) {
if (node==NULL) return(true);

// false if this node violates the min/max constraint
if (node->datadata>max) return(false);

// otherwise check the subtrees recursively,
// tightening the min or max constraint
return
isBSTUtil(node->left, min, node->data) &&
isBSTUtil(node->right, node->data+1, max)
);
}


15. TreeList Solution (C/C++)
The solution code in C and Java to the great Tree-List recursion problem is in CSLibrary #109 http://cslibrary.stanford.edu/109/

II.
The Great Tree-List Recursion Problem

By Nick Parlante
From http://cslibrary.stanford.edu/109/TreeListRecursion.html

Problem Statement
Here's the formal problem statement: Write a recursive function treeToList(Node root) that takes an ordered binary tree and rearranges the internal pointers to make a circular doubly linked list out of the tree nodes. The "previous" pointers should be stored in the "small" field and the "next" pointers should be stored in the "large" field. The list should be arranged so that the nodes are in increasing order. Return the head pointer to the new list. The operation can be done in O(n) time -- essentially operating on each node once. Basically take figure-1 as input and rearrange the pointers to make figure-2.
Try the problem directly, or see the hints below.

Hints

Hint #1
The recursion is key. Trust that the recursive call on each sub-tree works and concentrate on assembling the outputs of the recursive calls to build the result. It's too complex to delve into how each recursive call is going to work -- trust that it did work and assemble the answer from there.

Hint #2
The recursion will go down the tree, recursively changing the small and large sub-trees into lists, and then append those lists together with the parent node to make larger lists. Separate out a utility function append(Node a, Node b) that takes two circular doubly linked lists and appends them together to make one list which is returned. Writing a separate utility function helps move some of the complexity out of the recursive function.

5. Lessons and Solution Code
The solution code is given below in Java and C. The most important method is treeToList() and the helper methods join() and append(). Here are the lessons I see in the two solutions...
Trust that the recursive calls return correct output when fed correct input -- make the leap of faith. Look at the partial results that the recursive calls give you, and construct the full result from them. If you try to step into the recursive calls to think how they are working, you'll go crazy.
Decomposing out well defined helper functions is a good idea. Writing the list-append code separately helps you concentrate on the recursion which is complex enough on its own.

C Solution Code
/*
TreeList.c

C code version of the great Tree-List recursion problem.
See http://cslibrary.stanford.edu/109/ for the full
discussion and the Java solution.

This code is free for any purpose.
Feb 22, 2000
Nick Parlante nick.parlante@cs.stanford.edu
*/


#include
#include
#include

/* The node type from which both the tree and list are built */
struct node {
int data;
struct node* small;
struct node* large;
};
typedef struct node* Node;



/*
helper function -- given two list nodes, join them
together so the second immediately follow the first.
Sets the .next of the first and the .previous of the second.
*/
static void join(Node a, Node b) {
a->large = b;
b->small = a;
}


/*
helper function -- given two circular doubly linked
lists, append them and return the new list.
*/
static Node append(Node a, Node b) {
Node aLast, bLast;

if (a==NULL) return(b);
if (b==NULL) return(a);

aLast = a->small;
bLast = b->small;

join(aLast, b);
join(bLast, a);

return(a);
}


/*
--Recursion--
Given an ordered binary tree, recursively change it into
a circular doubly linked list which is returned.
*/
static Node treeToList(Node root) {
Node aList, bList;

if (root==NULL) return(NULL);

/* recursively solve subtrees -- leap of faith! */
aList = treeToList(root->small);
bList = treeToList(root->large);

/* Make a length-1 list ouf of the root */
root->small = root;
root->large = root;

/* Append everything together in sorted order */
aList = append(aList, root);
aList = append(aList, bList);

return(aList);



/* Create a new node */
static Node newNode(int data) {
Node node = (Node) malloc(sizeof(struct node));
node->data = data;
node->small = NULL;
node->large = NULL;
return(node);
}


/* Add a new node into a tree */
static void treeInsert(Node* rootRef, int data) {
Node root = *rootRef;
if (root == NULL) *rootRef = newNode(data);
else {
if (data <= root->data) treeInsert(&(root->small), data);
else treeInsert(&(root->large), data);
}
}


static void printList(Node head) {
Node current = head;

while(current != NULL) {
printf("%d ", current->data);
current = current->large;
if (current == head) break;
}
printf("\n");
}


/* Demo that the code works */
int main() {
Node root = NULL;
Node head;

treeInsert(&root, 4);
treeInsert(&root, 2);
treeInsert(&root, 1);
treeInsert(&root, 3);
treeInsert(&root, 5);

head = treeToList(root);

printList(head); /* prints: 1 2 3 4 5 */

return(0);
}

III.
发信人: goodguys (goodguys), 信区: JobHunting
标 题: Re: Interview Questions from Goldman Sachs
发信站: Unknown Space - 未名空间 (Fri Feb 25 15:19:14 2005)

Is this group inside the FICC? One of my friend got this question at the on
campus interview with Gloldman FICC.

Suppose that there is an array a[1,2,...,n-k] with size n-k. Every entry in
this array is an integer from 1 to n, with no integer appears twice in the
array.

Clearly there are k integers missing in this arary.

Question: write a subroutine to print out this k integers. Your program should
run in no more than O(n) time and O(k) space.

If you give the right algorithm on spot, you will be asked to write down the
algorithm in solid C/C++ code.

In case you can not solve it on spot, you can email the answer to the
interviewer in one day!

发信人: yjhsjtu (xx), 信区: JobHunting
标 题: Re: Interview Questions from Goldman Sachs
发信站: Unknown Space - 未名空间 (Fri Feb 25 17:04:38 2005)

haha! except reading, you use writing to the original array.
if so, there is an easier method to resolve it.

for i = n-k+1: n
a(i)= 1;
end
for i=1: n-k
t = abs(a(i));
a(t) = - abs(a(t))
end

all the numbers not in the original array will be marked as positive
finally, just do

for i=1: n-k
a(i) = abs(a(i));
end

the original array will not be affected, but used.
from the point of algorithm, no problem.
from the point of programming, sometimes it does not work if the input array
is const array and you are required to use o(k) space.

IV.
发信人: goodguys (goodguys), 信区: JobHunting
标 题: Another MS Coding Interview Question
发信站: Unknown Space - 未名空间 (Fri Feb 25 17:43:57 2005)

Good work everyone on the GS interview question. Now let us try another one
from MS:

Given an array of length N containing integers between 1 and N-1. Wirte a
subroutine to output one of the duplicate elements. Be careful, it is possible
that some elements may be missing, and there may have multiple duplicate.

(Is there an O(n) time solution that uses only O(1) extra space and does not
destroy the original array?)

--
来源:.Unknown Space - 未名空间 mitbbs.com.[FROM: 131.215.]

发信人: fengeer (千山万水), 信区: JobHunting
标 题: Re: Another MS Coding Interview Question
发信站: Unknown Space - 未名空间 (Fri Feb 25 17:49:51 2005)

yjhsjtu 's solution can directly be used here if the original array is not a
constant array, right?

V.
内存对齐问题

From http://dev.csdn.net/develop/article/48/48057.shtm

typedef struct
{
UINT32 NumElements;
union
{
UINT32 ObjectHandle;
}Entry;
}STR_ARRAY, *PSTR_ARRAY;

还有这两句#pragma pack(push, 1)
#pragma pack(pop)

#pragma pack( [ n ] )
该指令指定结构和联合成员的紧凑对齐。而一个完整的转换单元的结构和联合
的紧凑对齐由/ Z p 选项设置。紧凑对齐用p a c e 编译指示在数据说明层设置。该
编译指示在其出现后的第一个结构或联合说明处生效。该编译指示对定义无效。
当你使用#pragma pack ( n ) 时, 这里n 为1 、2 、4 、8 或1 6 。第一个结构成员之
后的每个结构成员都被存储在更小的成员类型或n 字节界限内。如果你使用无
参量的#pragma pack , 结构成员被紧凑为以/ Z p 指定的值。该缺省/ Z p 紧凑值为
/ Z p 8 。
编译器也支持以下增强型语法:
#pragma pack( [ [ { p u s h | p o p } , ] [ 标识符, ] ] [ n] )
若不同的组件使用p a c k 编译指示指定不同的紧凑对齐, 这个语法允许你把程序
组件组合为一个单独的转换单元。
带p u s h 参量的p a c k 编译指示的每次出现将当前的紧凑对齐存储到一个内部编
译器堆栈中。编译指示的参量表从左到右读取。如果你使用p u s h , 则当前紧凑
值被存储起来; 如果你给出一个n 的值, 该值将成为新的紧凑值。若你指定一个
标识符, 即你选定一个名称, 则该标识符将和这个新的的紧凑值联系起来。
带一个p o p 参量的p a c k 编译指示的每次出现都会检索内部编译器堆栈顶的值,
并且使该值为新的紧凑对齐值。如果你使用p o p 参量且内部编译器堆栈是空的,
则紧凑值为命令行给定的值, 并且将产生一个警告信息。若你使用p o p 且指定一
个n 的值, 该值将成为新的紧凑值。
若你使用p o p 且指定一个标识符,  所有存储在堆栈中的值将从栈中删除, 直到找
到一个匹配的标识符, 这个与标识符相关的紧凑值也从栈中移出, 并且这个仅在
标识符入栈之前存在的紧凑值成为新的紧凑值。如果未找到匹配的标识符, 将使
用命令行设置的紧凑值, 并且将产生一个一级警告。缺省紧凑对齐为8 。
p a c k 编译指示的新的增强功能让你编写头文件, 确保在遇到该头文件的前后的
紧凑值是一样的。

什么是内存对齐

考虑下面的结构:

struct foo
{
char c1;
short s;
char c2;
int i;
};

假设这个结构的成员在内存中是紧凑排列的,假设c1的地址是0,那么s的地址就应该是1,c2的地址就是3,i的地址就是4。也就是
c1 00000000, s 00000001, c2 00000003, i 00000004。

可是,我们在Visual c/c++ 6中写一个简单的程序:

struct foo a;
printf("c1 %p, s %p, c2 %p, i %p\n",
(unsigned int)(void*)&a.c1 - (unsigned int)(void*)&a,
(unsigned int)(void*)&a.s - (unsigned int)(void*)&a,
(unsigned int)(void*)&a.c2 - (unsigned int)(void*)&a,
(unsigned int)(void*)&a.i - (unsigned int)(void*)&a);
运行,输出:
c1 00000000, s 00000002, c2 00000004, i 00000008。

为什么会这样?这就是内存对齐而导致的问题。

为什么会有内存对齐

以下内容节选自《Intel Architecture 32 Manual》。
字,双字,和四字在自然边界上不需要在内存中对齐。(对字,双字,和四字来说,自然边界分别是偶数地址,可以被4整除的地址,和可以被8整除的地址。)
无论如何,为了提高程序的性能,数据结构(尤其是栈)应该尽可能地在自然边界上对齐。原因在于,为了访问未对齐的内存,处理器需要作两次内存访问;然而,对齐的内存访问仅需要一次访问。
一个字或双字操作数跨越了4字节边界,或者一个四字操作数跨越了8字节边界,被认为是未对齐的,从而需要两次总线周期来访问内存。一个字起始地址是奇数但却没有跨越字边界被认为是对齐的,能够在一个总线周期中被访问。
某些操作双四字的指令需要内存操作数在自然边界上对齐。如果操作数没有对齐,这些指令将会产生一个通用保护异常(#GP)。双四字的自然边界是能够被16整除的地址。其他的操作双四字的指令允许未对齐的访问(不会产生通用保护异常),然而,需要额外的内存总线周期来访问内存中未对齐的数据。

编译器对内存对齐的处理

缺省情况下,c/c++编译器默认将结构、栈中的成员数据进行内存对齐。因此,上面的程序输出就变成了:
c1 00000000, s 00000002, c2 00000004, i 00000008。
编译器将未对齐的成员向后移,将每一个都成员对齐到自然边界上,从而也导致了整个结构的尺寸变大。尽管会牺牲一点空间(成员之间有空洞),但提高了性能。
也正是这个原因,我们不可以断言sizeof(foo) == 8。在这个例子中,sizeof(foo) == 12。

如何避免内存对齐的影响

那么,能不能既达到提高性能的目的,又能节约一点空间呢?有一点小技巧可以使用。比如我们可以将上面的结构改成:

struct bar
{
char c1;
char c2;
short s;
int i;
};
这样一来,每个成员都对齐在其自然边界上,从而避免了编译器自动对齐。在这个例子中,sizeof(bar) == 8。

这个技巧有一个重要的作用,尤其是这个结构作为API的一部分提供给第三方开发使用的时候。第三方开发者可能将编译器的默认对齐选项改变,从而造成这个结构在你的发行的DLL中使用某种对齐方式,而在第三方开发者哪里却使用另外一种对齐方式。这将会导致重大问题。
比如,foo结构,我们的DLL使用默认对齐选项,对齐为
c1 00000000, s 00000002, c2 00000004, i 00000008,同时sizeof(foo) == 12。
而第三方将对齐选项关闭,导致
c1 00000000, s 00000001, c2 00000003, i 00000004,同时sizeof(foo) == 8。

如何使用c/c++中的对齐选项

vc6中的编译选项有 /Zp[1|2|4|8|16] ,/Zp1表示以1字节边界对齐,相应的,/Zpn表示以n字节边界对齐。n字节边界对齐的意思是说,一个成员的地址必须安排在成员的尺寸的整数倍地址上或者是n的整数倍地址上,取它们中的最小值。也就是:
min ( sizeof ( member ), n)
实际上,1字节边界对齐也就表示了结构成员之间没有空洞。
/Zpn选项是应用于整个工程的,影响所有的参与编译的结构。
要使用这个选项,可以在vc6中打开工程属性页,c/c++页,选择Code Generation分类,在Struct member alignment可以选择。

要专门针对某些结构定义使用对齐选项,可以使用#pragma pack编译指令。指令语法如下:
#pragma pack( [ show ] | [ push | pop ] [, identifier ] , n )
意义和/Zpn选项相同。比如:

#pragma pack(1)
struct foo_pack
{
char c1;
short s;
char c2;
int i;
};
#pragma pack()

栈内存对齐

我们可以观察到,在vc6中栈的对齐方式不受结构成员对齐选项的影响。(本来就是两码事)。它总是保持对齐,而且对齐在4字节边界上。

验证代码

#include

struct foo
{
char c1;
short s;
char c2;
int i;
};

struct bar
{
char c1;
char c2;
short s;
int i;
};

#pragma pack(1)
struct foo_pack
{
char c1;
short s;
char c2;
int i;
};
#pragma pack()


int main(int argc, char* argv[])
{
char c1;
short s;
char c2;
int i;

struct foo a;
struct bar b;
struct foo_pack p;

printf("stack c1 %p, s %p, c2 %p, i %p\n",
(unsigned int)(void*)&c1 - (unsigned int)(void*)&i,
(unsigned int)(void*)&s - (unsigned int)(void*)&i,
(unsigned int)(void*)&c2 - (unsigned int)(void*)&i,
(unsigned int)(void*)&i - (unsigned int)(void*)&i);

printf("struct foo c1 %p, s %p, c2 %p, i %p\n",
(unsigned int)(void*)&a.c1 - (unsigned int)(void*)&a,
(unsigned int)(void*)&a.s - (unsigned int)(void*)&a,
(unsigned int)(void*)&a.c2 - (unsigned int)(void*)&a,
(unsigned int)(void*)&a.i - (unsigned int)(void*)&a);

printf("struct bar c1 %p, c2 %p, s %p, i %p\n",
(unsigned int)(void*)&b.c1 - (unsigned int)(void*)&b,
(unsigned int)(void*)&b.c2 - (unsigned int)(void*)&b,
(unsigned int)(void*)&b.s - (unsigned int)(void*)&b,
(unsigned int)(void*)&b.i - (unsigned int)(void*)&b);

printf("struct foo_pack c1 %p, s %p, c2 %p, i %p\n",
(unsigned int)(void*)&p.c1 - (unsigned int)(void*)&p,
(unsigned int)(void*)&p.s - (unsigned int)(void*)&p,
(unsigned int)(void*)&p.c2 - (unsigned int)(void*)&p,
(unsigned int)(void*)&p.i - (unsigned int)(void*)&p);

printf("sizeof foo is %d\n", sizeof(foo));
printf("sizeof bar is %d\n", sizeof(bar));
printf("sizeof foo_pack is %d\n", sizeof(foo_pack));

return 0;
}

VI.
From http://community.csdn.net/Expert/topic/3804/3804035.xml?temp=.6248438

#pragma pack(8)

struct s1{
short a;
long b;
};

struct s2{
char c;
s1 d;
long long e;
};

#pragma pack()


1.sizeof(s2) = ?
2.s2的s1中的a后面空了几个字节接着是b?

回复人: redleaves(ID最吊的网友) ( ) 信誉:105 2005-02-26 17:25:00

还是我给出正确答案吧:
如果代码:
#pragma pack(8)
struct S1{
char a;
long b;
};
struct S2 {
char c;
struct S1 d;
long long e;
};
#pragma pack()
sizeof(S2)结果为24.
成员对齐有一个重要的条件,即每个成员分别对齐.即每个成员按自己的方式对齐.
也就是说上面虽然指定了按8字节对齐,但并不是所有的成员都是以8字节对齐.其对齐的规则是,每个成员按其类型的对齐参数(通常是这个类型的大小)和指定对齐参数(这里是8字节)中较小的一个对齐.并且结构的长度必须为所用过的所有对齐参数的整数倍,不够就补空字节.
S1中,成员a是1字节默认按1字节对齐,指定对齐参数为8,这两个值中取1,a按1字节对齐;成员b是4个字节,默认是按4字节对齐,这时就按4字节对齐,所以sizeof(S1)应该为8;
S2中,c和S1中的a一样,按1字节对齐,而d 是个结构,它是8个字节,它按什么对齐呢?对于结构来说,它的默认对齐方式就是它的所有成员使用的对齐参数中最大的一个,S1的就是4.所以,成员d就是按4字节对齐.成员e是8个字节,它是默认按8字节对齐,和指定的一样,所以它对到8字节的边界上,这时,已经使用了12个字节了,所以又添加了4个字节的空,从第16个字节开始放置成员e.这时,长度为24,已经可以被8(成员e按8字节对齐)整除.这样,一共使用了24个字节.
a b
S1的内存布局:11**,1111,
c S1.a S1.b d
S2的内存布局:1***,11**,1111,****11111111

这里有三点很重要:
1.每个成员分别按自己的方式对齐,并能最小化长度
2.复杂类型(如结构)的默认对齐方式是它最长的成员的对齐方式,这样在成员是复杂类型时,可以最小化长度
3.对齐后的长度必须是成员中最大的对齐参数的整数倍,这样在处理数组时可以保证每一项都边界对齐



<< Home

This page is powered by Blogger. Isn't yours?