Thursday, April 28, 2011

Algorithmic Complexity

Hi all! Today I'll tell you something about algorithmic complexity.

Without going into details, the complexity of an algorithm can be described as the behavior of its run-time as a function of the input size, for example
$T(n) = O(n)$
or
$T(n) = \Theta(n)$
or
$T(n) = \Omega(n)$
Here $T(n)$ is the run-time, $n$ is the size of the data, and $O$, $\Theta$, $\Omega$ are asymptotic bounds:
$O$ - an upper bound (at worst),
$\Theta$ - a tight bound (both upper and lower at once),
$\Omega$ - a lower bound (at best).
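More formally (the standard definitions, slightly simplified):
$T(n) = O(g(n))$ means there exist $c > 0$ and $n_0$ such that $T(n) \le c \, g(n)$ for all $n \ge n_0$
$T(n) = \Omega(g(n))$ means there exist $c > 0$ and $n_0$ such that $T(n) \ge c \, g(n)$ for all $n \ge n_0$
$T(n) = \Theta(g(n))$ means both of the above hold at once
For example, linear search in an array is $O(n)$ (the element is last or missing) and $\Omega(1)$ (the element is first).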

Let me show you an example:


#include <cstdio>
int n;
int main(){

     scanf("%d", &n);
     printf("%d", 2 * n * n - n);

     return 0;

}


The complexity of this algorithm is $T(n) = O(1)$: the formula takes a fixed number of operations no matter how big $n$ is. When working with asymptotic complexity we pay no attention to constant multipliers. So, if we write:



#include <cstdio>
int n;
int main(){

     scanf("%d", &n);
     printf("%d", 2 * n * n - n);
     printf("%d", 3 * n * n * n - n + 2);
     printf("%d", 7 * n * n - 1);

     return 0;

}


Our complexity is still $O(1)$.
Let's say we need to read $n$ numbers and calculate their sum; our code would be something like:




#include <cstdio>
#define maxn 10100
int n, A[maxn], sum;
int main(){

     sum = 0;

     scanf("%d", &n);
     for (int i = 0; i < n; ++i) scanf("%d", A + i);

     for (int i = 0; i < n; ++i) sum += A[i];
     printf("%d\n", sum);

     return 0;

}


Here our code calls "scanf" $n$ times and performs "+" $n$ times too, so our complexity is $T(n) = O(n)$.
This code uses additional memory, so we have one more kind of complexity - memory complexity. For our sample it is $M(n) = O(n)$.
Let's solve this task a bit more efficiently:




#include <cstdio>
int n, x, sum;
int main(){

     sum = 0;

     scanf("%d", &n);
     for (int i = 0; i < n; ++i){ 

             scanf("%d", &x);
             sum += x;

     }
     printf("%d\n", sum);

     return 0;

}



So, here our run-time complexity hasn't changed (the constant multiplier even drops a bit, since we make one pass over the data instead of two), but we no longer store $n$ elements in memory, so:
$T(n) = O(n)$
$M(n) = O(1)$
Want something a bit harder? Now we need to sort this data. It could be done with the STL function std::sort or with quicksort, both $\Theta(n \log n)$ on average, but let's use this one:



#include <cstdio>
#include <algorithm>

#define maxn 10100
int n, A[maxn];

/* Shell sort: gapped insertion sort with a shrinking gap sequence */
void shellsort(int * A, int n){

     int incr, i, j;
     /* gap sequence: n/2, n/4, ..., 1 */
     for (incr = n / 2; incr > 0; incr /= 2)
         for (i = incr; i < n; ++i)
             /* insert A[i] into its gapped subsequence by swapping backwards */
             for (j = i - incr; j > -1; j -= incr)
                 if (A[j + incr] < A[j])
                    std::swap(A[j], A[j + incr]);
                 else j = -1; /* element is in place: stop this insertion */

}

int main(){

     scanf("%d", &n);
     for (int i = 0; i < n; ++i) scanf("%d", A + i);

     shellsort(A, n);

     for (int i = 0; i < n; ++i) printf("%d ", A[i]);

     return 0;

}


I use this sort sometimes, but std::sort is the easiest way. This sort (Shell sort) is interesting because it's rarely seen, yet it is short and fast. So, what about complexity? Three nested loops, you say $O(n^3)$? No, I told you it is fast :) This algorithm has no tight $\Theta$ bound, because its run-time strongly depends on the input data and on the gap sequence. In the best case:
$T(n) = \Omega(n \log n)$
and with a good gap sequence (e.g. Hibbard's $2^k - 1$) the worst case is:
$T(n) = O(n \sqrt n)$
(the plain halving sequence used above can degrade to $O(n^2)$ in the worst case), and on average it is closer to the $\Omega$ bound.
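
For comparison, here's how the same task could look with std::sort (a minimal sketch, same input format as above):


#include <cstdio>
#include <algorithm>

#define maxn 10100
int n, A[maxn];

int main(){

     scanf("%d", &n);
     for (int i = 0; i < n; ++i) scanf("%d", A + i);

     std::sort(A, A + n); /* introsort in practice, O(n log n) on average */

     for (int i = 0; i < n; ++i) printf("%d ", A[i]);

     return 0;

}
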

Let's discuss recursion. If you can expand a recursion into a simple iteration, you'll have no problems finding the complexity of the algorithm. But sometimes that's impossible. Let's go back to the sorting task; it can be done with mergesort:



#include <cstdio>

#define maxn 1000100

/* mergesort(array, left-range, right-range); sorts array A in range [l; r) */
/* returns the number of inversions (the "chaos number") */
int mergesort(int *, int, int);
/* merge(array, left, right); merges two sorted parts */
/* returns the number of inversions */
int merge(int *, int, int);

int n, A[maxn], T[maxn];

int main(){
 
 scanf("%d", &n);
 for (int i = 0; i < n; ++i) scanf("%d", A + i);
 
 int chaos = mergesort(A, 0, n);
 
 for (int i = 0; i < n; ++i) printf("%d ", A[i]);
 printf("\n chaos number: %d\n", chaos);
 
 return 0;
 
}

int mergesort(int * A, int l, int r){
 
 /* one element is sorted already */
 if (l >= r - 1)
  return 0; /* one element can't create a chaos */
 
 int h = (l + r) / 2;
 /* sort and calculate chaos number */
 return mergesort(A, l, h) + mergesort(A, h, r) + merge(A, l, r);
 
}

/* T - buffer */
int merge(int * A, int l, int r){
 
 int i, j, k, h = (l + r) / 2, res = 0;
 
 for (i = l, j = h, k = l; i < h && j < r; ){
  
  if (A[i] <= A[j]) /* current pair is in the right order (equal elements are not inversions) */
   T[k++] = A[i++]; /* copy it, and increase k - end pointer and i - left pointer */
  else {
   
   T[k++] = A[j++]; /* copy it, and increase k - end pointer and j - right pointer */
   res += h - i;  /* A[j] creates chaoses with A[i], A[i + 1], .. , A[h - 1] */
   
  }
  
 }
 
 /* it's possible that one part is depleted while the other still has elements */
 /* just copy them to the end (they are sorted and they are maximal) */
 for (; i < h; ) T[k++] = A[i++]; /* note: no ++i in the loop header, A[i++] already advances i */
 for (; j < r; ) T[k++] = A[j++];
 
 /* copy result from buffer to our array */
 for (i = l; i < r; ++i) A[i] = T[i];
 
 return res; /* return chaos number for this merge */
 
}



OK, that's a bit of long code. Its logic is in merging two already sorted arrays: we divide the array again and again until each part has only one element (one element is sorted by definition), and then we start merging this mess, one with one, then two with two, then four with four, etc. This method is generally called "divide and conquer". Now look at the code of mergesort: it calls itself twice with the data size divided by 2, calls merge once, and returns immediately when r - l <= 1 (the one-element case), so:
$T(1) = O(1)$
$T(n) = 2 T(n / 2) + T_m(n)$
where $T_m(n)$ is the run-time of merge; obviously:
$T_m(n) = O(n)$
Solving this, we find that the total work at each recursion level is $O(n)$, and $n$ is halved at each level until it reaches 1, so the number of recursion levels is $\log n$; therefore:
$T(n) = O(n \log n)$
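To see this explicitly, assume the merge cost is $T_m(n) \le c n$ for a constant $c$ and unroll the recurrence:
$T(n) = 2 T(n / 2) + c n = 4 T(n / 4) + 2 c n = \dots = 2^k T(n / 2^k) + k c n$
When $n / 2^k = 1$, i.e. $k = \log_2 n$, this gives:
$T(n) = n T(1) + c n \log_2 n = O(n \log n)$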
In addition, this algorithm lets us count the number of inversions (the "chaos number"): pairs $i < j$ with $A[i] > A[j]$. Another way to count them is bubblesort with $T(n) = O(n^2)$ (its number of swaps equals the number of inversions), so mergesort is better for that purpose, but it needs twice as much memory.
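
For comparison, here's a minimal $O(n^2)$ sketch that counts inversions directly by checking every pair (this is equivalent to counting bubblesort's swaps):


#include <cstdio>

#define maxn 10100
int n, A[maxn];

int main(){

     scanf("%d", &n);
     for (int i = 0; i < n; ++i) scanf("%d", A + i);

     /* every pair (i, j) with i < j and A[i] > A[j] is one inversion */
     int chaos = 0;
     for (int i = 0; i < n; ++i)
         for (int j = i + 1; j < n; ++j)
             if (A[i] > A[j]) ++chaos;

     printf("chaos number: %d\n", chaos);

     return 0;

}
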

These algorithms were polynomial, but there are also NP-complete problems. As far as we know, they can't be solved without checking all possible solutions, or a comparable number of them (with some kinds of optimizations). Any NP-complete problem can be reduced to any other NP-complete problem.
Typical complexities of exact algorithms for NP-complete problems are:
$T(n) = O(n!)$ - brute force
$T(n) = O(2^n n)$ - dynamic programming over subsets
$T(n) = O(3^n n^2)$ - another subset dynamic
etc.
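
For instance, here's a minimal $O(n!)$ brute-force sketch for the travelling salesman problem, a classic NP-complete problem (the input format and the limit of 16 cities are just assumptions for illustration):


#include <cstdio>
#include <algorithm>

/* brute-force TSP: fix city 0 as the start and try all (n - 1)!
   orders of the remaining cities; assumes 2 <= n <= 16 */
int n, d[16][16];
int p[16];

int main(){

     scanf("%d", &n);
     for (int i = 0; i < n; ++i)
         for (int j = 0; j < n; ++j) scanf("%d", &d[i][j]);

     for (int i = 0; i < n - 1; ++i) p[i] = i + 1;

     int best = 1 << 30;
     do {

             int len = d[0][p[0]];
             for (int i = 0; i + 1 < n - 1; ++i) len += d[p[i]][p[i + 1]];
             len += d[p[n - 2]][0]; /* close the cycle back to the start */
             best = std::min(best, len);

     } while (std::next_permutation(p, p + n - 1));

     printf("%d\n", best);

     return 0;

}
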

Phew, that's all for today.

If you have ideas on how to improve this topic, or you have noticed a mistake, feel free to comment or mail me at waterlink000@gmail.com.

Best regards
Good luck :)
