Friday, July 09, 2004

 

Some interview questions from "Programming Pearls"

1. Vector Rotation
The Problem
Rotate vectorx[n] left bydpositions.
For n=8 andd=3,
changeabcdefghtodefghabc.

Constraints:O(n) time,O( 1 ) extra space.
Pricey Solutions
Storedin intermediate vector, shift the rest, store back. [O(n) extra space.]
Rotate by 1dtimes. [O(n) time.]

A Juggling Algorithm
The Code
for i = [0, gcd(d, n))
/* move i-th values of blocks */
t = x[i]
j = i
loop
k = j + d
if k >= n
k -= n
if k == i
break
x[j] = x[k]
j = k
x[j] = t

The Block-Swap Algorithm
The Idea: Change abtoba
If a is shorter, divide b into bl and br.
Swap a and br to change a bl br into br bl a.
Recur on pieces of b.
The Code

if d == 0 || d == n
return
i = p = d
j = n - p
while i != j
if i > j
swap(p-i, p, j)
i -= j
else
swap(p-i, p+j-i, i)
j -= i
swap(p-i, p, i)

The Reversal Algorithm

The Code /* rotateabcdefghleft three */
reverse(0, d-1) /* cbadefgh */
reverse(d, n-1) /* cbahgfed */
reverse(0, n-1) /* defghabc */

Best is the Block-Swap Algorithm, second is The Reversal Algorithm.

2. Anagrams

Definition.Two words are anagrams if one can be formed by permuting the letters in the other.

A 6-element anagram set:
opts pots post stop spot tops

The Problem.How would you compute all anagram sets in a dictionary of 230,000 English words? (Related problem: how to find all anagrams of an input word?)

Two Slow Algorithms
Examine All Permutations. ~~1.1*10^9 seconds
Examine All Pairs of Words. ~~14.7 hours

A Fast Algorithm
The Idea.Sign words so that those in the same anagram class have the same signature, and then collect equal signatures.
The Signature.Sort letters within a word. The signature of deposit is deiopst, which is also the signature of dopiest.
Collecting the Classes.Sort the words by their signatures.

signin C
int charcomp(char *x, char *y)
{ return(*x - *y); }
#define WORDMAX 100
int main(void)
{ char word[WORDMAX], sig[WORDMAX];
while (scanf("%s", word) != EOF) {
strcpy(sig, word);
qsort(sig, strlen(sig),
sizeof(char), charcomp);
printf("%s %s\n", sig, word);
}
return 0;
}

squashin C
int main(void)
{ char word[MAX], sig[MAX], oldsig[MAX];
int linenum = 0;
strcpy(oldsig, "");
while (scanf("%s %s", sig, word) != EOF)
if (strcmp(oldsig, sig) != 0
&& linenum > 0)
printf("\n");
strcpy(oldsig, sig);
linenum++;
printf("%s ", word);
}
printf("\n");
return 0;
}

The Complete Program
Executed by
sign grams
where dictc ontains 230,000 words.
Total run time is 18 seconds: 4 insign, 11 insort and 3 insquash.

Some Output:
subessential suitableness
canter creant cretan nectar recant tanrec trance
caret carte cater crate creat creta react recta trace
destain instead sainted satined
adroitly dilatory idolatry
least setal slate stale steal stela tales
reins resin rinse risen serin siren
constitutionalism misconstitutional

3.
The Problem Definition. Given the real vectorx[n], compute the maximum sum found in any contiguous subvector.

A Cubic Algorithm
Idea.For all pairs of integersiandjsatisfying 0<=i<=jCode.
maxsofar = 0
for i = [0, n)
for j = [i, n)
sum = 0
for k = [i, j]
sum += x[k]
/* sum is sum of x[i..j] */
maxsofar = max(maxsofar, sum)

Run Time. O(n).

A Quadratic Algorithm
Idea.The sum ofx[i..j] is close to the previous sum,x[i..j-1 ].
Code.
maxsofar = 0
for i = [0, n)
sum = 0
for j = [i, n)
sum += x[j]
/* sum is sum of x[i..j] */
maxsofar = max(maxsofar, sum)

Run Time. O(n).

Another Quadratic Algorithm
Idea.A cumulative array allows sums to be computed quickly. Ifytd[i] contains year-to-date sales through monthi, then sales from March through September are given byytd[sep]-ytd[feb].
Implementation.Use the cumulative arraycumarr.
. . .
Initializecumarr[i]=x[ 0 ]++x[i]. The sum of the values inx[i..j] iscumarr[j]-cumarr[i-1 ].

Code for Algorithm 2b.
cumarr[-1] = 0
for i = [0, n)
cumarr[i] = cumarr[i-1] + x[i]
maxsofar = 0
for i = [0, n)
for j = [i, n)
sum = cumarr[j] - cumarr[i-1]
/* sum is sum of x[i..j] */
maxsofar = max(maxsofar, sum)

Run Time. O(n).

AnO(nlogn) Algorithm
The Divide-and-Conquer Schema.To solve a problem of sizen, recursively solve two subproblems of sizen /2 and combine their solutions.
The Idea.Divide into two subproblems.

Recursively find maximum in subvectors.
Find maximum crossing subvector.
Return max ofma,mbandmc.
Run Time. O(nlogn).

Code for theO(NlogN) Algorithm
float maxsum3(l, u)
if (l > u) /* zero elements */
return 0
if (l == u) /* one element */
return max(0, x[l])
m = (l + u) / 2
/* find max crossing to left */
lmax = sum = 0
for (i = m; i >= l; i--)
sum += x[i]
lmax = max(lmax, sum)
/* find max crossing to right */
rmax = sum = 0
for i = (m, u]
sum += x[i]
rmax = max(rmax, sum)
return max(lmax+rmax,
maxsum3(l, m),
maxsum3(m+1, u))

A Linear Algorithm
Idea.How can we extend a solution forx[ 0..i-1 ]
into a solution forx[ 0..i]? Key variables:
Code.
maxsofar = 0
maxhere = 0
for i = [0, n)
/* invariant: maxhere and maxsofar
are accurate for x[0..i-1] */
maxhere = max(maxhere + x[i], 0)
maxsofar = max(maxsofar, maxhere)
Run Time. O(n).

4.
Longest Duplicated String
The Problem
Input: Ask not what your country can do for you, but what you can do for your country.
Output: can do for you (15 chars)

A Simple Algorithm (at least quadratic)
maxlen = -1
for i = [0, n)
for j = (i, n)
if (l=comlen(&c[i], &c[j])) > maxlen
{
maxlen = l
maxi = i
maxj = j
}

An Important Function
int comlen(char *p, char *q)
i = 0
while *p && (*p++ == *q++)
i++
return i

A Fast Algorithm
Key Data Structures
#define MAXN 5000000
char c[MAXN], *a[MAXN];
Input Suffix Array for banana
a[0]: banana
a[1]: anana
a[2]: nana
a[3]: ana
a[4]: na
a[5]: a

Scan through to find longest duplicated string (ana).

The Suffix Array Code
The Code (usually aboutO(nlogn))
while (ch = getchar()) != EOF
{
a[n] = &c[n]
c[n++] = ch
}
c[n] = 0
qsort(a, n, sizeof(char *), pstrcmp)
for i = [0, n)
if comlen(a[i], a[i+1]) > maxlen
{
maxlen = comlen(a[i], a[i+1])
maxi = i
}
printf("%.*s\n", maxlen, a[maxi])

4.8 secs on 807,503 characters of Homer's Iliad:
whose sake so many of the Achaeans have died at Troy, far from their homes? Go about at once among the host, and speak fairly to them, man by man, that they draw not their ships into the sea.

The Complete Code
#include
#include
#include
char inputchars[4300000];
char *word[800000];
int nword = 0;
int k = 2;
int wordncmp(char *p, char* q)
{ int n = k;
for ( ; *p == *q; p++, q++)
if (*p == 0 && --n == 0)
return 0;
return *p - *q;
}
int sortcmp(char **p, char **q)
{ return wordncmp(*p, *q);
}
char *skip(char *p, int n)
{ for ( ; n > 0; p++)
if (*p == 0)
n--;
return p;
}
int main()
{ int i, wordsleft = 10000, l, m, u;
char *phrase, *p;
word[0] = inputchars;
while (scanf("%s", word[nword]) != EOF) {
word[nword+1] = word[nword] + strlen(word[nword]) + 1;
nword++;
}
for (i = 0; i < k; i++)
word[nword][i] = 0;
for (i = 0; i < k; i++)
printf("%s0, word[i]);
qsort(word, nword, sizeof(word[0]), sortcmp);
phrase = inputchars;
for ( ; wordsleft > 0; wordsleft--) {
l = -1;
u = nword;
while (l+1 != u) {
m = (l + u) / 2;
if (wordncmp(word[m], phrase) < 0)
l = m;
else
u = m;
}
for (i = 0; wordncmp(phrase, word[u+i]) == 0; i++)
if (rand() % (i+1) == 0)
p = word[u+i];
phrase = skip(p, 1);
if (strlen(skip(phrase, k-1)) == 0)
break;
printf("%s0, skip(phrase, k-1));
}
return 0;
}



<< Home

This page is powered by Blogger. Isn't yours?