**Maximum path sum I**

**Problem 18**

By starting at the top of the triangle below and moving to adjacent numbers on the row below, the maximum total from top to bottom is 23.

3 7 4 2 4 6 8 5 9 3

That is, 3 + 7 + 4 + 9 = 23.

Find the maximum total from top to bottom of the triangle below:

75 95 64 17 47 82 18 35 87 10 20 04 82 47 65 19 01 23 75 03 34 88 02 77 73 07 63 67 99 65 04 28 06 16 70 92 41 41 26 56 83 40 80 70 33 41 48 72 33 47 32 37 16 94 29 53 71 44 65 25 43 91 52 97 51 14 70 11 33 28 77 73 17 78 39 68 17 57 91 71 52 38 17 14 91 43 58 50 27 29 48 63 66 04 68 89 53 67 30 73 16 69 87 40 31 04 62 98 27 23 09 70 98 73 93 38 53 60 04 23

**NOTE:** As there are only 16384 routes, it is possible to solve this problem by trying every route. However, Problem 67, is the same challenge with a triangle containing one-hundred rows; it cannot be solved by brute force, and requires a clever method! ;o)

So another interesting problem. Reading for the first time it made me think to take the same approach as warned in the note section to not take which is calculate the sum of for every path and then find the maximum between them. However, as mentioned in the note section of the question there is also problem 67 which has 100 rows and it will take almost infinite time to solve it if we use the above method. Also, the programs should always be scalable enough. I was sure that I want to write a code which solves both the problem at one go. There is no point solving the same problems at different time with different codes.

I was trying to think of various ways I could solve this problem. My mind was kind of blocked at the same solution and could not think of any other solution. I then gave up on finding the solution and started thinking how should I store the values in them, which is the data structure that I should use. I was thinking of graph maybe because with graphs it has the parent and child concept which could be useful here as here for every node (except the leaf node) there were two children which should be considered, so somehow I need to maintain the relationship. However, it would be a little expensive to store graphs. Besides, I was not even sure if this can be done using R. I have heard of the `igraph`

package which can be used here but have never used it before. So maybe with `igraph`

or go to the traditional C, C++ structures.

One way, I could think of was to start from the top, take the maximum of its children and move to next children. So for the example shown, we start from 3, take the maximum of 7 and 4 which is 7, then go to 7’s children which is 2 and 4, select 4 and then select maximum from 5 and 9 which is 9. So we now have 3 + 7 + 4 + 9 = 23 which is the answer. Although this gave us the correct answer I was pretty sure this is not the correct approach as it misses so many important numbers which are not even considered. This worked for this case because it was a small example. Just to confirm that the approach is 100 % not working I used it on the bigger 10 X 10 dataset and calculated the sum manually and checked the answer which did not come correct.

Googled the question and the first link gave me the approach of this to be solved which was more than enough. The link does not contain the answer in any language it is just an approach. Now, the part remaining was to code the above requirement and I was still left with the original question of how to store the values. I thought of storing it in matrix with n X n values where n are the number of levels/rows in the pyramid. I could fill up the empty values with NA’s. It was possible to get their respective children’s easily as the structure is very descriptive and follow a specific sequence. After thinking of the pattern which they follow, i thought why do I even need a matrix for this. If they already follow a specific pattern , they should also follow the same pattern in the vector and there is no need of NA’s to fill in.

get_max_value_from_pyramid = function(vec, noOfRows) { #Repalcing \n with empty spaces vec = gsub("\n", " ", vec) reverseVec = rev(unlist(strsplit(vec, " "))) for( i in noOfRows:2) { vecIndex = tail(seq(sum(seq(noOfRows, i)), sum(seq(noOfRows,i-1))), -1) outputList = mapply(function(value, ind) { reverseVec <<- replace(reverseVec, ind, as.numeric(value) + max(as.numeric(reverseVec[ind - i]), as.numeric(reverseVec[ind - i + 1]))) },reverseVec[vecIndex], vecIndex) maxOutputList = max(as.numeric(outputList)) } return(maxOutputList) }

To explain the code, let’s start from the beginning. There are two parameters which need to be send as input to the function. One is the pyramid passed as string and other is the number of rows/levels in the pyramid. The benefit of sending the pyramid as string is that you can just copy paste the pyramid from the question itself and it will run directly. Now, when we copy-paste there are some line breaks which are encountered in the text as “\n”, so first thing is I remove those line break characters and replace it with empty characters. Then as we need to perform the bottom up approach instead of top down one, we need to reverse the entire vector. Right now, all the numbers are in one string, so separating them as different numbers using `strsplit `

.

So if you have not read the above stakoverflow link , let me explain you the approach which we are trying to follow here. In the example shown of 4 rows, we will start from second last row, add 2 with 8 and 5 respectively and keep the one which is maximum. so here we do 2 + 8 > 2 + 5 so we keep 10. Going forward we do 4 + 5 < 4 + 9 so we keep 13 and 6 + 9 > 6 + 3, so we keep 15. Now our pyramid would look like:

3 7 4 10 13 15

Continuing further, we reduce it to

3 20 19

and then finally we only have one number

`23 `

The example shown is for illustration purpose. In reality we are not deleting any element in the original array, we are just incrementally replacing values in them. So what happens actually is,

Original array – 3 7 4 2 4 6 8 5 9 3

Reversed array – 3 9 5 8 6 4 2 4 7 3

After first iteration – 3 9 5 8 10 13 15 4 7 3

After second iteration – 3 9 5 8 10 13 15 20 19 3

After third iteration – 3 9 5 8 10 13 15 20 19 23

So this is how we get 23 as answer.

Also, I have read mostly in R, one should not use the global assignment operator ( <<- ) as it can have lot of side effects. If we want to use the same variable again which you are changing in the function, it might get corrupted with that global assignment operator. Here I am not using the original array again in the function and moreover this works fine for my example. One another option I was thinking was of using recursion, but recursion is too expensive. It would work fine with 15 levels but it would be computationally too expensive for problem 67 with 100 levels.

vec = "3 7 4 2 4 6 8 5 9 3" get_max_value_from_pyramid(vec, 4) #[1] 23 get_max_value_from_pyramid(vec1, 15) #[1] 1074 get_max_value_from_pyramid(vec2, 100) #[1] 7273