Eureka!!: Segment Tree

Thursday, July 21, 2011

Segment Tree

Segment trees (for beginners)
--------------------------------------

Lets start with a simple problem.

Given an array of numbers. There are 2 types of operations to do.
1. update any element in the array
2. find the maximum/minimum in the given range (i,j)

Simple solution.
-- updating is no problem. a[i]=x; :D
-- finding maximum/minimum can be done in one for loop of (j-i+1) iterations.

Can we do any better?
The answer is yes.
In fact we can do both the operation in O(log n) where n is the size of the array.

That's when segment tree comes handy.

A Segment Tree is a binary tree.
For the sake of convinience, lets take an array of size N = 2^n.

Suppose we have some work to do. If one person does it all, a lot of effort/time is needed.
So to reduce the time as well as effort we can divide the responsibilities among many people.

We can divide the responsibilities in Segment Trees as well.
Lets distribute the responsibilities to different nodes as follows.
-> each node carries the maximum and/or minimum value of all its childrens.

For example:
Suppose we have an array of size 8.

Let us construct a complete binary tree (segment tree) as below.

In a segment tree the list of leaf nodes represents the actual array where the main data resides. Here there are 8 leaf nodes which correspond to the elements of array we wish to update or find max/min in the range.

Each node stores the required information (here maximum/minimum) assigned to them.
for eg: The root node stores maximum/minimum of all the nodes from node 1 to node 8.

Assume such tree exists. Lets walk through how we process the following queries.

1. Find maximum in range (1,8)

Return the answer stored in the root of the tree.

2. Find maximum in range (3,6)

check root(1-8) ---(our range is a subset of this range) so branch in.
check (1-4) hmm..contains partially
check (1-2) out of range...dont check its children.
check (3-4) hmm..completely inside the range...take the value..dont go to children(since it lies completely inside the range (3,6) ).
check (5-8) hmm...contains our answer partially
check (5-6) completely inside our range. so take its info. dont branch in.
check (7-8) completely outside our range. dont branch in.

Now finally compare all the values we took in the process.
Remember we took the values of range(3-4) and (5-6)
Just compare them and we have the maximum.

Here we visited none of the leaf nodes. We visited some internal nodes only.
In practice if the segment tree is very large then we skip too many nodes.
So effectively the number of nodes examined will be of order of O(height) the tree.
which is log N.

***********Happy Coding*******************

Let me give you how to build the tree for finding maximum in a range.
A binary tree can be represented by an array. For convenience I have named the root as 1. The 2 children of a node n are 2*n and 2*n + 1.

If you have an array 'arr' of size 1000, then call build function as below.
build (1, 0, 999);

int tree[3*MAX]; //a segment tree is nearly twice as large as the array.
int arr[MAX];

void build(int n, int b, int e)
{
if (b>e) return;
else if (b==e)
{
tree[n] = arr[b];
return;
}

build ( n*2 , b , (b+e)/2 ); //go to children...child nodes of node n are 2n and 2n+1.

build (n*2+1, (b+e)/2 + 1 , e );

//now both child nodes 2n and 2n+1 are built (ie they have done their responsibility of storing the correct information)
tree[n] = max( tree[n*2] , tree[n*2 + 1] );

}

/*
* update in the tree at index idx with value val.
* remember here n is the node number of the tree and not index of array...value of root node 1 and its children are 2 and 3
* idx is the mapping of the leaf nodes to array. when b==e we reached leaf node
*/
void update(int n, int b, int e, int idx, int val)
{
if (b>e || b>idx || e<idx ) return;

if (b==e) //at leaf node
{
tree[n] = val;
return;
}

update( n*2 , b , (b+e)/2 , idx, val );
update( n*2 + 1 , (b+e)/2 + 1 , e , idx, val );

//now some change might have been made in either of the child nodes.

tree[n] = max( tree[n*2] , tree[n*2 + 1] );

}

//query function is homework.

int query(int n, int b, int e, int i, int j) //return the maximum in the range (i,j)
{
...
...
}

/******************************************************************************/
/******************************************************************************/
/******************************************************************************/

The problem of finding maximum in the range requires a little more effort.

Reason:

-> lazy propagation updates only at top level nodes.

-> each node can either store increments done to the range or only the maximum.

having only one of these is not sufficient.

Thus we need to store both the information.

store 2 variables. one is maximum of the child nodes and another is total increment done to the node.

the actual maximum number at each node is

maximum number stored in a node + total increments to the node and all its parents.

Below is full source code of the program.

#include < cstdio >
#include < iostream >
using namespace std;

#define INF 1e9

struct Node {
int offt;
int cmax;
} tree[5000];

void update(int n, int b, int e, int i, int j, int val)
{
if (b>e || i>j || b>j || e

if (b>=i && e<=j)
{
tree[n].offt += val;
tree[n].cmax += val;
return;
}

update(n*2 , b , (b+e)/2 , i , j , val);
update(n*2+1 , (b+e)/2 + 1 , e , i , j , val);

tree[n].cmax = max ( tree[n*2].cmax , tree[n*2+1].cmax ) + tree[n].offt;
}

int query(int n, int b, int e, int i, int j, int offt)
{
if (b>e || i>j || b>j || e

if (b>=i && e<=j)
return tree[n].cmax + offt; //the increment of current node is already added to cmax here (see update function)

offt += tree[n].offt;
return max ( query(n*2 , b , (b+e)/2 , i , j, offt) ,
query(n*2+1 , (b+e)/2 + 1 , e , i , j, offt) );
}

int main()
{
int tc,x,a,b,v;
cin >> tc;
while(tc--)
{
cin >> x >> a >> b;
if ( x == 0 )
{
cin >> v;
update(1,0,999,a,b,v);
}
else
cout << query(1,0,999,a,b,0) << endl;
}
}

**************************************************************************

36 comments:

geekJuly 21, 2011 at 5:00 AM
plz xplain a bit more how update is implemented
ReplyDelete
Replies
geekJuly 21, 2011 at 6:49 AM
This comment has been removed by a blog administrator.
ReplyDelete
Replies
AshishJuly 21, 2011 at 9:02 AM
small typing error. now corrected.
ReplyDelete
Replies
geekJuly 21, 2011 at 10:20 AM
what is the use of idx here in update function??
ReplyDelete
Replies
AshishJuly 21, 2011 at 1:11 PM
idx maps to the index in the array.
basically (b,e) is a range represented by node n
so when b=e we reach the leaf node corresponding to index b (which at that time also equals idx) of the array.
ReplyDelete
Replies
shadowJuly 22, 2011 at 11:40 PM
in 2nd prob. in update section ....shouldn't the line "tree[n]+=val" be "tree[n]+=(e+1-b)*val"...
ReplyDelete
Replies
UnknownJuly 29, 2011 at 10:08 PM
sir,in the second illustration you gave just after the tree diagram....i think you have mistyped the query

it should have been
find maximum in range (3,6)
ReplyDelete
Replies
geekJuly 31, 2011 at 1:57 AM
int query(int n,int b,int e,int i,int j)
{
if(i>e || j=i && e<=j) return tree[n];

int p1=query(2*n,b,(b+e)/2,i,j);
int p2=query(2*n+1,(b+e)/2+1,e,i,j);

if(p1==-1) return p2;
if(p2==-1) return p1;
if(p1>p2) return p1;
else return p2;

}
ReplyDelete
Replies
geekJuly 31, 2011 at 1:59 AM
sir how to impement this with lazy propogation...................
plz xpain a bit more.... don post the code for now
ReplyDelete
Replies
mad_coderJanuary 24, 2012 at 1:26 AM
Sir, i have implemented the first query function as in the code given below but with extra space complexity. Is there a way to reduce the extra space complexity ?
int arr[MAX],tree[2*MAX],max_arr[2*max];
int query(int n,int b,int e,int i,int j)
{
if(b>e||b>j||e=i&&e<=j)
return tree[n];
max_arr[2*n]=query(2*n,b,(b+e)/2,i,j);
max_arr[2*n+1]=query(2*n+1,(b+e)/2+1,e,i,j);
return max(max_arr[2*n],max_arr[2*n+1]);
}
ReplyDelete
Replies
NitinFebruary 12, 2012 at 9:33 AM
Nice tutorial man. Really helpful :)
ReplyDelete
Replies
CSLAugust 2, 2012 at 11:12 AM
where is the full code sir
ReplyDelete
Replies
UnknownAugust 4, 2012 at 3:44 PM
Hi Ashish Pant, the update function doesn't seem to work properly.
I wonder if you could explain what is `b` and `e`? Is it the full range of the data array? If I have an array data[10], then will update is called as update(1, 0, 10, left, right, some-value)? Thanks.
ReplyDelete
Replies
swatAugust 11, 2012 at 7:42 AM
thanks dude. nice code.
ReplyDelete
Replies
Atul anandSeptember 23, 2012 at 10:42 AM
Hi could you please explain the following :-
struct Node {
int offt;
int cmax;
} tree[5000];

what is offt and cmax holding ?
when i initalize the tree , then what value should be stored at offt and cmax?
in the query function query(1,0,999,a,b,0);// last parameter is offt , what is the use of this parameter.

Thanks in advance
ReplyDelete
Replies
karimNovember 12, 2012 at 12:01 PM
hi ashish i really like your tutorial specially the part with lazy propagation but i noticed that the full source code for the program is not entirely correct (not the missing parts) but in the update function

1 - update(n*2 , b , (b+e)/2 , i , j , val);
2 - update(n*2+1 , (b+e)/2 + 1 , e , i , j , val);
3 - tree[n].cmax = max ( tree[n*2].cmax , tree[n*2+1].cmax );

the 3rd line i think we should add the tree[n].offt to the maximum as this maximum is not propagated so it would give incorrect results if we query this exact interval again
ReplyDelete
Replies
RavinNovember 13, 2012 at 3:21 PM
Hi, could you please provide query function for range modification and range sum, because I don't get it.

For example we got array with size 10 and two queries:
1) add to whole array 3
2) get sum from element 4 to 5 inclusive

According to your code we'll call update(1, 0, 9, 0, 9, 3) and it will update first node and break; so question is how do we implement sum function here using nlog time? thanks :)
ReplyDelete
Replies
Panagiotis KostopanagiotisJanuary 2, 2013 at 7:11 AM
nice one.. pretty straight forward article with nice explanations.
Segment tree is a really important data structure for programming competitions , and articles like this are really helpful!

Thanks :)
ReplyDelete
Replies
Viki SharmaFebruary 7, 2013 at 8:55 PM
Wonderful blog! Really useful :)
I had a doubt with a particular question involving segment trees:
http://www.spoj.com/problems/IOPC1207/
What I can't understand is why we can use 3 independent segment trees. If we update the x range say x1 to x2, won't the corresponding segment trees on y and z axis also get affected?
ReplyDelete
Replies
MayankMay 9, 2013 at 3:56 AM
really helpful blog. Loved how simple the explanation was. I had a doubt though. If i need to do a query for a range and find both the min and max in that range what should be done? I made two trees with one build function: one containing the maximums and the other containing minimums. But i cant seem to figure out how to do both the jobs with one query function i am having to call two separate functions one for maximum and one for minimum.

Thanks
ReplyDelete
Replies
MayankMay 11, 2013 at 8:52 AM
ya got that...silly on my part :D
thanx a lot!
ReplyDelete
Replies
Lone RangerDecember 8, 2013 at 9:11 AM
Dude, you're a life saver. Been looking all over on how to query the goddamn tree and it finally makes sense!.
Thanks a ton!
ReplyDelete
Replies
UnknownDecember 18, 2014 at 1:23 AM
we have to take array size of 2^n for make segment tree of n element array,then how to make segment tree of 30 or greater elements,because 2^30 = 1073741824
then how to take array of this larger size for make segment tree,how to implement ?
ReplyDelete
Replies
Nidhi MakhijaniFebruary 15, 2015 at 2:44 AM
great post!
ReplyDelete
Replies
UnknownSeptember 12, 2016 at 7:49 PM
how to do update on interval using lazy:
update func:
update all from l to r with their divisors
ReplyDelete
Replies